RESULT 1
AAU11635 

ID AAU11635 standard; protein; 330 AA. 
XX 

AC AAU11635; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Human Neuregulin-2alpha, NRG-2alpha. 
XX 

KW Human; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis; cell survival;

KW cell growth; cell differentiation; erbB receptor; cardiomyopathy; 

KW ischaemic damage; cardiac trauma; heart failure; atherosclerosis; 

KW vascular lesion; vascular hypertension; 

KW degenerative congenital vascular disease; myasthenia gravis; 

KW neurodegenerative disorder; peripheral neuropathy; 



KW sensory nerve fiber neuropathy; motor fiber neuropathy; 

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia; 

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal. 

XX 

OS Homo sapiens. 
XX 

PN WO200189568-A1. 
XX 

PD 29-NOV-2001.
XX

PF 23-MAY-2001; 2001WO-US016896.
XX

PR 23-MAY-2000; 2000US-0206495P.
XX 

PA (CENE-) CENES PHARM INC. 
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR N-PSDB; AAS18019. 
XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating multiple 

PT sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's disease, by 

PT increasing mitogenesis, survival, growth or differentiation of a cell. 
XX 

PS Claim 53; Fig 7; 79pp; English.
XX 

CC The invention relates to a substantially pure neuregulin (NRG) -2 

CC polypeptide comprising or consisting of a sequence for human NRG-2alpha 

CC or NRG-2beta (clone 2b7) and the polynucleotides encoding the. Also 

CC included are a vector expressing the protein, a host cell comprising the 

CC vector, a transgenic non-human animal transformed with the vector or 

CC having a knockout mutation in one or both NRG-2 alleles and an anti-NRG-2 

CC antibody. Analysis of mutations in NRG-2 in an individual is useful for 

CC diagnosing an increased likelihood of developing a NRG-2-related disease 

CC or condition in a test subject. NRG-2 is useful for increasing the 

CC mitogenesis, survival, growth or differentiation of a cell (e.g. a 

CC neuronal cell), where the cell expresses an erbB receptor. NRG-2 is 

CC useful for treating diseases and disorders such as cardiomyopathy 

CC (preferably degenerative congenital disease), ischaemic damage, cardiac 

CC trauma or heart failure or which has a condition affecting smooth muscle 

CC which include atherosclerosis, vascular lesion, vascular hypertension, 

CC and degenerative congenital vascular disease, myasthenia gravis, a 

CC neurodegenerative disorder, peripheral neuropathy, a sensory nerve fiber 

CC neuropathy, a motor fiber and a sensory nerve fiber neuropathy, multiple 

CC sclerosis, amyotrophic lateral sclerosis, spinal muscular atrophy, nerve 

CC injury, Alzheimer's disease, Parkinson's disease, cerebellar ataxia, and 

CC spinal cord injury. The antibody is useful for treatment of a tumour 

CC comprising inhibiting proliferation of a tumour cell preferably a glial 

CC tumour cell, for treating of neurofibromatosis by inhibiting glial cell 

CC mitogenesis. The present sequence represents NRG-2alpha 

XX 

SQ Sequence 330 AA; 



Query Match 100.0%; Score 1749; DB 5; Length 330;

Best Local Similarity 100.0%; Pred. No. 1.1e-108;

Matches 330; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300

Qy 301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330

       ||||||||||||||||||||||||||||||
Db 301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330



RESULT 2 
ABB07894 

ID ABB07894 standard; protein; 422 AA. 
XX 

AC ABB07894; 
XX 

DT 03-JUL-2002 (first entry) 
XX 

DE Human neuregulin 2 isoform 6. 
XX 

KW Human; MUC1; mucin; glycoprotein; cytostatic; cancer; tumour; ECD; 

KW extracellular domain; neuregulin 2; isoform. 

XX 

OS Homo sapiens. 
XX 

PN WO200222685-A2. 
XX 

PD 21-MAR-2002. 
XX 

PF 11-SEP-2001; 2001WO-US028548.
XX

PR 11-SEP-2000; 2000US-0231841P.
XX 

PA (KUFE/) KUFE D W. 
PA (OHNO/) OHNO T. 
XX 

PI Kufe DW, Ohno T; 
XX 

DR WPI; 2002-339864/37.
XX

PT Use of a mucin glycoprotein (MUC1) extracellular domain antagonist for

PT manufacturing a medicant that inhibits the proliferation of MUC-1

PT expressing cancer cells and that can treat cancers and reduce tumor

PT growth.
XX

PS Claim 6; Page 56-58; 74pp; English.
XX

CC The invention relates to the use of a MUC1 (mucin glycoprotein)

CC extracellular domain (ECD) antagonist for the manufacture of a medicant

CC to inhibit the proliferation of MUC-1 expressing cancer cells. MUC1 ECD

CC antagonists (optionally combined with a pharmaceutical carrier) can be

CC administered to inhibit proliferation of MUC1-expressing cancer cells,

CC useful to treat cancers e.g. skin cancer, prostate cancer and leukemia,

CC especially in humans. The method may also be combined with administration

CC of a chemotherapeutic agent (e.g. an alkylating agent, topisomerase etc)

CC or radiation to treat cancer, especially to reduce tumour growth. The

CC polypeptides are also useful in screening to identify MUC1 ECD

CC antagonists. The present sequence represents a human neuregulin 2 isoform

CC 6, a fragment of which can bind to MUC1/ECD
XX

SQ Sequence 422 AA;



Query Match 100.0%; Score 1749; DB 5; Length 422; 

Best Local Similarity 100.0%; Pred. No. 1.5e-108; 

Matches 330; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 392

Qy 301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330

       ||||||||||||||||||||||||||||||
Db 393 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 422



RESULT 3 
AAW27537 

ID AAW27537 standard; protein; 330 AA. 
XX 

AC AAW27537;
XX

DT 18-DEC-1997 (first entry)
XX

DE Rat cerebellum derived growth factor 2.
XX

KW Rat; cerebellum derived growth factor; CDGF2; screening; binding;

KW modulation; erbB type receptor; identification; indication; risk;

KW proliferation; differentiation; induction; neuron; hyperplasia;

KW stem cell culture; intracerebral graft; alleviation; repair;

KW behavioural defect; nervous system; central; peripheral; nerve;

KW prothesis; damage; entubulation; cell survival; treatment; injury;

KW trauma; ischaemia; ischemia; stroke; infection; disorder; inflammation;

KW neurodegeneration; disease; Parkinson's; Huntingdon's;

KW amylotrophic lateral sclerosis; sensory; retina;

KW spinocerebellar degeneration; multiple sclerosis; neoplasia;

KW amalignant glioma; medulloblastoma; neuroectodermal tumour.
XX

OS Rattus rattus.
XX
FH Key Location/Qualifiers

FT Region 253

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 254

FT /note= "potential N-glycosylation site"


FT Region 261

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 267

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 278

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 280

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 289

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"


XX

PN WO9709425-A1.
XX

PD 13-MAR-1997.
XX

PF 09-SEP-1996; 96WO-US014484.
XX

PR 08-SEP-1995; 95US-00525864.
XX

PA (HARD ) HARVARD COLLEGE.

PA (STRD ) UNIV LELAND S STANFORD.

XX 

PI Chang H; 
XX 

DR WPI; 1997-192900/17. 

DR N-PSDB; AAT87923. 
XX 

PT Rat and human cerebellum-derived growth factors - used in the treatment 

PT of neuronal injury and proliferative disorders. 

XX 

PS Claim 1; Page 70-71; 94pp; English. 
XX 

CC The present sequence is rat cerebellum derived growth factor 2 (CDGF2), 

CC which can be used to screen for modulators of CDGF binding to erbB type 

CC receptors. Identification of a modification or mutation in a CDGF gene, 

CC or aberrant expression of a CDGF gene or levels of soluble CDGF may be 

CC used to indicate the risk of unwanted cell proliferation or 

CC differentiation. CDGF may be used to induce neuronal differentiation in 

CC stem cell culture, and maintain the integrity of a terminally 

CC differentiated neuronal cell culture, e.g. useful for intracerebral 

CC grafting to alleviate behavioural defects. CDGF may also be used in nerve 

CC protheses to repair central and peripheral nerve damage, especially where 

CC a crushed or severed axon is entubulated by a prosthetic. CDGF may also 

CC be used to enhance neuronal cell survival in the central or peripheral 

CC nervous system, to treat neurological conditions associated with nervous 

CC system injury, e.g. traumatic, chemical or vasal injury and deficits such 

CC as ischaemia resulting from stroke, infectious/inflammatory and tumour 

CC induced injury, chronic neurodegenerative disease including Parkinson's

CC and Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the central 

CC nervous system, e.g. amalignant gliomas, medulloblastomas and 

CC neuroectodermal tumours 

XX 

SQ Sequence 330 AA; 



Query Match 98.5%; Score 1722; DB 2; Length 330; 

Best Local Similarity 97.9%; Pred. No. 7.1e-107; 

Matches 323; Conservative 4; Mismatches 3; Indels 0; Gaps 0; 

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

       |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 60

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

       |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db 121 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 240

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300

Qy 301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330

       ||||||||||||||||||||||||||||||
Db 301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330



RESULT 4 
ABB07893 

ID ABB07893 standard; protein; 426 AA. 
XX 

AC ABB07893;
XX 

DT 03-JUL-2002 (first entry) 
XX 

DE Human neuregulin 2 isoform 5. 
XX 

KW Human; MUC1; mucin; glycoprotein; cytostatic; cancer; tumour; ECD;

KW extracellular domain; neuregulin 2; isoform. 

XX 

OS Homo sapiens. 
XX 

PN WO200222685-A2. 
XX 

PD 21-MAR-2002.
XX

PF 11-SEP-2001; 2001WO-US028548.
XX

PR 11-SEP-2000; 2000US-0231841P.
XX 

PA (KUFE/) KUFE D W. 

PA (OHNO/) OHNO T. 
XX 

PI Kufe DW, Ohno T; 
XX 

DR WPI; 2002-339864/37. 
XX 

PT Use of a mucin glycoprotein (MUC1) extracellular domain antagonist for

PT manufacturing a medicant that inhibits the proliferation of MUC-1 

PT expressing cancer cells and that can treat cancers and reduce tumor 

PT growth. 
XX 

PS Claim 6; Page 53-55; 74pp; English. 
XX 

CC The invention relates to the use of a MUC1 (mucin glycoprotein)

CC extracellular domain (ECD) antagonist for the manufacture of a medicant

CC to inhibit the proliferation of MUC-1 expressing cancer cells. MUC1 ECD

CC antagonists (optionally combined with a pharmaceutical carrier) can be

CC administered to inhibit proliferation of MUC1-expressing cancer cells,



CC useful to treat cancers e.g. skin cancer, prostate cancer and leukemia, 

CC especially in humans. The method may also be combined with administration 

CC of a chemotherapeutic agent (e.g. an alkylating agent, topisomerase etc) 

CC or radiation to treat cancer, especially to reduce tumour growth. The 

CC polypeptides are also useful in screening to identify MUC1 ECD 

CC antagonists. The present sequence represents a human neuregulin 2 isoform 

CC 5, a fragment of which can bind to MUC1/ECD 
XX 

SQ Sequence 426 AA; 

Query Match 98.3%; Score 1720; DB 5; Length 426; 

Best Local Similarity 100.0%; Pred. No. 1.3e-106; 

Matches 324; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 392

Qy 301 DPKQSVLWDTPGTGVSSSQWSTSP 324

       ||||||||||||||||||||||||
Db 393 DPKQSVLWDTPGTGVSSSQWSTSP 416



RESULT 5
AAW63700

ID AAW63700 standard; protein; 860 AA.
XX

AC AAW63700;
XX

DT 29-SEP-1998 (first entry)
XX

DE Receptor type tyrosine kinase ErbB ligand.
XX

KW Receptor type tyrosine kinase ErbB; ligand; diagnostic agent;

KW nervous disease; cancer.
XX

OS Rattus sp.
XX

PN JP10179166-A.
XX

PD 07-JUL-1998.
XX

PF 25-DEC-1996; 96JP-00356998.
XX

PR 25-DEC-1996; 96JP-00356998.
XX 

PA (HIGA/) HIGASHIYAMA S. 
XX 

DR WPI; 1998-430952/37. 

DR N-PSDB; AAV43674. 
XX 

PT Gene coding the ligand of the tyrosine kinase ErbB receptor - useful for 

PT diagnosing and treating nervous diseases and cancer. 

XX 

PS Claim 1; Page 9-13; 17pp; Japanese. 
XX 

CC This represents the ligand of receptor type tyrosine kinase ErbB. A 

CC prokaryotic or eukaryotic host cell transformed by a recombinant vector 

CC containing the encoding DNA can be used for the recombinant production of 

CC the protein. The invention provides a method for inhibiting the formation 

CC of the ligand of receptor type tyrosine kinase ErbB in an animal using an 

CC antibody recognizing the protein. The ligand of the tyrosine kinase ErbB 

CC receptor and associated materials can be used for treating or diagnosing 

CC nervous diseases and cancers 
XX 

SQ Sequence 860 AA; 

Query Match 90.1%; Score 1575; DB 2; Length 860; 

Best Local Similarity 97.4%; Pred. No. 1.3e-96; 

Matches 296; Conservative 4; Mismatches 4; Indels 0; Gaps 0; 

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

       |||||||| ||||||||||||||||||||||||||||||||||||| |||||||||||||
Db 109 MRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 168

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 169 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 228

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

       |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db 229 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 288

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db 289 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 348

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 349 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 408

Qy 301

Db 409



RESULT 6 
AAU11636 



ID AAU11636 standard; protein; 298 AA. 
XX 

AC AAU11636; 
XX 

DT 12-MAR-2002 (first entry)
XX 

DE Human Neuregulin-2beta, NRG-2beta. 
XX 

KW Human; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis; cell survival; 

KW cell growth; cell differentiation; erbB receptor; cardiomyopathy; 

KW ischaemic damage; cardiac trauma; heart failure; atherosclerosis; 

KW vascular lesion; vascular hypertension; 

KW degenerative congenital vascular disease; myasthenia gravis; 

KW neurodegenerative disorder; peripheral neuropathy; 

KW sensory nerve fiber neuropathy; motor fiber neuropathy; 

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia; 

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal. 

XX 

OS Homo sapiens. 
XX 

PN WO200189568-A1. 
XX 

PD 29-NOV-2001. 
XX 

PF 23-MAY-2001; 2001WO-US016896.
XX

PR 23-MAY-2000; 2000US-0206495P.
XX 

PA (CENE-) CENES PHARM INC. 
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR N-PSDB; AAS18020. 
XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating multiple 

PT sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's disease, by 

PT increasing mitogenesis, survival, growth or differentiation of a cell. 
XX 

PS Claim 53; Fig 9; 79pp; English. 
XX 

CC The invention relates to a substantially pure neuregulin (NRG) -2 

CC polypeptide comprising or consisting of a sequence for human NRG-2alpha 

CC or NRG-2beta (clone 2b7) and the polynucleotides encoding the. Also 

CC included are a vector expressing the protein, a host cell comprising the 

CC vector, a transgenic non-human animal transformed with the vector or 

CC having a knockout mutation in one or both NRG-2 alleles and an anti-NRG-2 

CC antibody. Analysis of mutations in NRG-2 in an individual is useful for 

CC diagnosing an increased likelihood of developing a NRG-2-related disease 

CC or condition in a test subject. NRG-2 is useful for increasing the 

CC mitogenesis, survival, growth or differentiation of a cell (e.g. a 

CC neuronal cell), where the cell expresses an erbB receptor. NRG-2 is 

CC useful for treating diseases and disorders such as cardiomyopathy 

CC (preferably degenerative congenital disease), ischaemic damage, cardiac 

CC trauma or heart failure or which has a condition affecting smooth muscle 



CC which include atherosclerosis, vascular lesion, vascular hypertension, 

CC and degenerative congenital vascular disease, myasthenia gravis, a 

CC neurodegenerative disorder, peripheral neuropathy, a sensory nerve fiber 

CC neuropathy, a motor fiber and a sensory nerve fiber neuropathy, multiple 

CC sclerosis, amyotrophic lateral sclerosis, spinal muscular atrophy, nerve 

CC injury, Alzheimer's disease, Parkinson's disease, cerebellar ataxia, and 

CC spinal cord injury. The antibody is useful for treatment of a tumour 

CC comprising inhibiting proliferation of a tumour cell preferably a glial 

CC tumour cell, for treating of neurofibromatosis by inhibiting glial cell 

CC mitogenesis. The present sequence represents NRG-2beta
XX 

SQ Sequence 298 AA; 

Query Match 86.0%; Score 1505; DB 5; Length 298; 

Best Local Similarity 98.6%; Pred. No. 1.8e-92; 

Matches 285; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 289

       ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289



RESULT 7 
AAW27536 

ID AAW27536 standard; protein; 754 AA. 
XX 

AC AAW27536; 
XX 

DT 18-DEC-1997 (first entry) 
XX 

DE Rat cerebellum derived growth factor 1.
XX 

KW Rat; cerebellum derived growth factor; CDGF1; screening; binding; 

KW modulation; erbB type receptor; identification; indication; risk; 

KW proliferation; differentiation; induction; neuron; hyperplasia; 

KW stem cell culture; intracerebral graft; alleviation; repair; 

KW behavioural defect; nervous system; central; peripheral; nerve; 

KW prothesis; damage; entubulation; cell survival; treatment; injury; 

KW trauma; ischaemia; ischemia; stroke; infection; disorder; inflammation; 

KW neurodegeneration; disease; Parkinson's; Huntingdon's; 

KW amylotrophic lateral sclerosis; sensory; retina; 



KW spinocerebellar degeneration; multiple sclerosis; neoplasia; 

KW amalignant glioma; medulloblastoma; neuroectodermal tumour. 
XX 

OS Rattus rattus. 
XX 

FH Key Location/Qualifiers

FT Peptide 1..23

FT /label= sig_peptide

FT Peptide 24..754

FT /label= mat_peptide

FT Region 55

FT /note= "potential N-glycosylation site"

FT Domain 158..228

FT /label= immunoglobulin_like_domain

FT Region 186

FT /note= "potential N-glycosylation site"

FT Domain 252..297

FT /label= epidermal_growth_factor_like_domain

FT Region 253

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 254

FT /note= "potential N-glycosylation site"

FT Region 261

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 267

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 278

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 280

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 289

FT /note= "characteristic cysteine of epidermal growth

FT factor like domain"

FT Region 296

FT /note= "potential N-glycosylation site"

FT Cleavage-site 314..315

FT /label= potential_proteolytic_site

FT Domain 316..338

FT /label= putative_transmembrane_domain
XX 

PN WO9709425-A1. 
XX 

PD 13-MAR-1997. 
XX 

PF 09-SEP-1996; 96WO-US014484.
XX

PR 08-SEP-1995; 95US-00525864.
XX

PA (HARD ) HARVARD COLLEGE.

PA (STRD ) UNIV LELAND S STANFORD.

XX

PI Chang H;
XX



DR WPI; 1997-192900/17. 

DR N-PSDB; AAT87922. 
XX 

PT Rat and human cerebellum-derived growth factors - used in the treatment 

PT of neuronal injury and proliferative disorders. 

XX 

PS Claim 1; Page 63-66; 94pp; English. 
XX 

CC The present sequence is rat cerebellum derived growth factor 1 (CDGF1),

CC which can be used to screen for modulators of CDGF binding to erbB type 

CC receptors. Identification of a modification or mutation in a CDGF gene, 

CC or aberrant expression of a CDGF gene or levels of soluble CDGF may be 

CC used to indicate the risk of unwanted cell proliferation or 

CC differentiation. CDGF may be used to induce neuronal differentiation in 

CC stem cell culture, and maintain the integrity of a terminally 

CC differentiated neuronal cell culture, e.g. useful for intracerebral 

CC grafting to alleviate behavioural defects. CDGF may also be used in nerve 

CC protheses to repair central and peripheral nerve damage, especially where 

CC a crushed or severed axon is entubulated by a prosthetic. CDGF may also 

CC be used to enhance neuronal cell survival in the central or peripheral 

CC nervous system, to treat neurological conditions associated with nervous 

CC system injury, e.g. traumatic, chemical or vasal injury and deficits such 

CC as ischaemia resulting from stroke, infectious/inflammatory and tumour 

CC induced injury, chronic neurodegenerative disease including Parkinson's 

CC and Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the central 

CC nervous system, e.g. amalignant gliomas, medulloblastomas and 

CC neuroectodermal tumours 

XX 

SQ Sequence 754 AA; 

Query Match 84.5%; Score 1478; DB 2; Length 754; 

Best Local Similarity 96.2%; Pred. No. 3.3e-90; 

Matches 278; Conservative 5; Mismatches 6; Indels 0; Gaps 0;

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

       |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 60

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

       |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db 121 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 240

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 289

       ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289



RESULT 8 
AAW48380 

ID AAW48380 standard; protein; 181 AA. 
XX 

AC AAW48380; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Mus musculus don-1 polypeptide. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell 

KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung; 

KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus;

KW wound healing; transmembrane. 
XX 

OS Mus musculus. 
XX 

FH Key Location/Qualifiers 

FT Domain 104..140

FT /note= "EGF domain" 

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US014585.
XX

PR 19-AUG-1996; 96US-00699591.

PR 19-NOV-1996; 96US-00753007.
XX

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 1998-169084/15. 

DR N-PSDB; AAV17813. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas

PT and adenocarcinoma(s), and for wound healing.

XX 

PS Claim 25; Fig 2; 121pp; English. 
XX 

CC The sequence is that encoded by a murine don-1 gene splice variant. Don-1

CC polypeptides stimulate proliferation of epithelial cells and thus are 

CC implicated in melanomas and adenocarcinomas in which epithelial cells 

CC proliferate out of control. Compounds that interfere with don-1 mediated 

CC cell proliferation can be used in the treatment of tumours such as 

CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 

CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 

CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 

CC cell proliferation, e.g. for wound healing 
XX 

SQ Sequence 181 AA; 



Query Match 54.9%; Score 960; DB 2; Length 181; 

Best Local Similarity 97.8%; Pred. No. 2.2e-56; 

Matches 177; Conservative 3; Mismatches 1; Indels 0; Gaps 0;



Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db   1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db  61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 329
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 180

Qy 330 N 330
       |
Db 181 N 181


RESULT 9
ABG71637

ID ABG71637 standard; protein; 181 AA.
XX

AC ABG71637;
XX

DT 14-JAN-2003 (first entry)
XX

DE Murine secreted splice variant of Don-1.
XX

KW Murine; Don-1; epidermal growth factor; EGF; neuregulin; mouse;
KW glycoprotein ligand; cell proliferation; cell proliferative disorder;
KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation;
KW cell survival; epithelial cell; wound healing; tumour formation; brain;
KW vulnerary; cytostatic.
XX

OS Mus sp.
XX

PN US2002127594-A1.
XX

PD 12-SEP-2002.
XX

PF 12-MAR-2002; 2002US-00096241.
XX

PR 22-JUN-2000; 2000US-00599789.
XX

PA (GEAR/) GEARING D P.
PA (BUSF/) BUSFIELD S J.
XX

PI Gearing DP, Busfield SJ;
XX

DR WPI; 2003-039584/03.
DR N-PSDB; ABS56034.
XX

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells,
PT for identifying proteins that interact with Don-1, and for regulating
PT tumor formation and progression in brain.
XX 

PS Claim 25; Fig 2; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides

CC are glycoprotein ligands. Both murine and human Don-1 sequences are 

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence represents murine secreted 

CC splice variant of Don-1 

XX 

SQ Sequence 181 AA; 



Query Match 54.9%; Score 960; DB 6; Length 181; 

Best Local Similarity 97.8%; Pred. No. 2.2e-56; 

Matches 177; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db   1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db  61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 329
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 180

Qy 330 N 330
       |
Db 181 N 181



RESULT 10 
ABG71639 

ID ABG71639 standard; protein; 469 AA. 
XX 

AC ABG71639; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE Human second splice variant of Don-1. 
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 



KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; brain; 

KW vulnerary; cytostatic. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Misc-difference 14

FT /note= "Encoded by AA" 
XX 

PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-00096241.
XX

PR 22-JUN-2000; 2000US-00599789.
XX

PA (GEAR/) GEARING D P.

PA (BUSF/) BUSFIELD S J.
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR N-PSDB; ABS56036. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 25; Fig 4; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides 

CC are glycoprotein ligands. Both murine and human Don-1 sequences are

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence represents human second 

CC splice variant of Don-1 

XX 

SQ Sequence 469 AA; 

Query Match 50.4%; Score 881; DB 6; Length 469; 
Best Local Similarity 100.0%; Pred. No. 1.2e-50; 

Matches 163; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy 202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy 262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||||||||||||||||||||||
Db 151 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 193



RESULT 11
AAW48383

ID AAW48383 standard; protein; 647 AA.
XX
AC AAW48383;
XX
DT 17-AUG-1998 (first entry)
XX
DE Homo sapiens don-1 polypeptide.
XX
KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell;
KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung;
KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus;
KW wound healing; transmembrane.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Domain 54..108
FT /note= "Ig domain"
FT Domain 142..178
FT /note= "EGF domain"
FT Domain 203..225
FT /note= "transmembrane domain"
FT Domain 226..647
FT /note= "cytoplasmic domain"
XX
PN WO9807736-A1.
XX
PD 26-FEB-1998.
XX
PF 18-AUG-1997; 97WO-US014585.
XX
PR 19-AUG-1996; 96US-00699591.
PR 19-NOV-1996; 96US-00753007.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Gearing DP, Busfield SJ;
XX
DR WPI; 1998-169084/15.
DR N-PSDB; AAV17816.
XX
PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas
PT and adenocarcinoma(s), and for wound healing.



XX 

PS Claim 25; Fig 7; 121pp; English. 
XX 

CC The sequence is that encoded by a human don-1 gene splice variant. Don-1 

CC polypeptides stimulate proliferation of epithelial cells and thus are 

CC implicated in melanomas and adenocarcinomas in which epithelial cells 

CC proliferate out of control. Compounds that interfere with don-1 mediated 

CC cell proliferation can be used in the treatment of tumours such as 

CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 

CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 

CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 

CC cell proliferation, e.g. for wound healing 
XX 

SQ Sequence 647 AA;

Query Match 50.4%; Score 881; DB 2; Length 647; 

Best Local Similarity 100.0%; Pred. No. 1.7e-50; 

Matches 163; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy 202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy 262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||||||||||||||||||||||
Db 151 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 193



RESULT 12
ABG71644

ID ABG71644 standard; protein; 647 AA.
XX

AC ABG71644;
XX

DT 14-JAN-2003 (first entry)
XX

DE Human third splice variant of Don-1.
XX

KW Human; Don-1; epidermal growth factor; EGF; neuregulin;
KW glycoprotein ligand; cell proliferation; cell proliferative disorder;
KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation;
KW cell survival; epithelial cell; wound healing; tumour formation; brain;
KW vulnerary; cytostatic.
XX

OS Homo sapiens.
XX

FH Key Location/Qualifiers
FT Misc-difference 14
FT /note= "Encoded by AA"
FT Misc-difference 310
FT /note= "Encoded by AGC"
XX

PN US2002127594-A1.
XX

PD 12-SEP-2002.
XX

PF 12-MAR-2002; 2002US-00096241.
XX 

PR 22-JUN-2000; 2000US-00599789 . 
XX 

PA (GEAR/) GEARING D P. 

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR N-PSDB; ABS56045. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 25; Fig 7; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides

CC are glycoprotein ligands. Both murine and human Don-1 sequences are

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence represents human third 

CC splice variant of Don-1 

XX 

SQ Sequence 647 AA; 

Query Match 50.4%; Score 881; DB 6; Length 647; 

Best Local Similarity 100.0%; Pred. No. 1.7e-50; 

Matches 163; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy 202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy 262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||||||||||||||||||||||
Db 151 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 193



RESULT 13 
AAW48382 

ID AAW48382 standard; protein; 469 AA. 
XX 

AC AAW48382; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Homo sapiens don-1 polypeptide. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; epithelial cell; 
KW proliferation; stimulation; treatment; tumours; skin; oesophagus; lung; 
KW breast; liver; pancreas; colon; prostate; gastrointestinal tract; uterus; 
KW wound healing; transmembrane. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Domain 54..108
FT /note= "Ig domain"
FT Domain 142..178
FT /note= "EGF domain"
FT Domain 203..225
FT /note= "transmembrane domain"
FT Domain 226..469
FT /note= "cytoplasmic domain"

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US014585.
XX

PR 19-AUG-1996; 96US-00699591.
PR 19-NOV-1996; 96US-00753007.
XX 

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 1998-169084/15. 
DR N-PSDB; AAV17815. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas

PT and adenocarcinoma(s), and for wound healing.

XX 

PS Claim 25; Fig 4; 121pp; English. 
XX 

CC The sequence is that encoded by a human don-1 gene splice variant. Don-1 
CC polypeptides stimulate proliferation of epithelial cells and thus are 
CC implicated in melanomas and adenocarcinomas in which epithelial cells 
CC proliferate out of control. Compounds that interfere with don-1 mediated 
CC cell proliferation can be used in the treatment of tumours such as 
CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 
CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 
CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 



CC cell proliferation, e.g. for wound healing 
XX 

SQ Sequence 469 AA; 



Query Match 50.0%; Score 875; DB 2; Length 469; 

Best Local Similarity 99.4%; Pred. No. 3e-50; 

Matches 162; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy 202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy 262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||| ||||||||||||||||||
Db 151 VNGGVCYYIEGINQLSCKCPNGFFAQRCLEKLPLRLYMPDPKQ 193



RESULT 14
AAW48381

ID AAW48381 standard; protein; 407 AA.
XX

AC AAW48381;
XX

DT 17-AUG-1998 (first entry)
XX

DE Homo sapiens don-1 polypeptide.
XX

KW Murine; don-1 gene; melanoma; treatment; adeno*
KW proliferation; stimulation; treatment; tumours
KW breast; liver; pancreas; colon; prostate; gast.
KW wound healing; transmembrane.
XX

OS Homo sapiens.
XX

FH Key Location/Qualifiers
FT Domain 16..70
FT /note= "Ig domain"
FT Domain 104..140
FT /note= "EGF domain"
FT Region 157..164
FT /note= "juxtamembrane region"
FT Domain 173..195
FT /note= "transmembrane domain"
FT Domain 196..407
FT /note= "cytoplasmic domain"
XX

PN WO9807736-A1.
XX

PD 26-FEB-1998.
XX

PF 18-AUG-1997; 97WO-US014585.
XX

PR 19-AUG-1996; 96US-00699591.
PR 19-NOV-1996; 96US-00753007.
XX

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 1998-169084/15. 

DR N-PSDB; AAV17814. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of melanomas

PT and adenocarcinoma(s), and for wound healing.

XX 

PS Claim 25; Fig 3; 121pp; English. 
XX 

CC The sequence is that encoded by a human don-1 gene splice variant. Don-1 

CC polypeptides stimulate proliferation of epithelial cells and thus are 

CC implicated in melanomas and adenocarcinomas in which epithelial cells 

CC proliferate out of control. Compounds that interfere with don-1 mediated 

CC cell proliferation can be used in the treatment of tumours such as 

CC melanomas and adenocarcinomas of the skin, oesophagus, lung, breast, 

CC liver, pancreas, gastrointestinal tract, colon, prostate or uterus. 

CC Alternatively, don-1 polypeptides can be used to stimulate epithelial 

CC cell proliferation, e.g. for wound healing 
XX 

SQ Sequence 407 AA; 

Query Match 48.1%; Score 842; DB 2; Length 407; 

Best Local Similarity 98.7%; Pred. No. 4e-48; 

Matches 156; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVL 307
       |||||||||||||||||||||||||||||||||||  |
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQKHL 158



RESULT 15 
ABG71638 

ID ABG71638 standard; protein; 407 AA. 
XX 

AC ABG71638; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE Human membrane-bound splice variant of Don-1. 
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; brain; 



KW vulnerary; cytostatic. 
XX 

OS Homo sapiens. 
XX 

PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-00096241.
XX 

PR 22-JUN-2000; 2000US-00599789 . 
XX 

PA (GEAR/) GEARING D P. 
PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR N-PSDB; ABS56035. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumor formation and progression in brain. 
XX 

PS Claim 25; Fig 3; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene called Don 

CC -1, and alternate splice variants of Don-1, which are related to 

CC epidermal growth factors (EGF) such as neuregulins. Don-1 polypeptides

CC are glycoprotein ligands. Both murine and human Don-1 sequences are 

CC cloned. The mouse Don-1 gene maps to chromosome 18. Don-1 polypeptides 

CC are useful for stimulating proliferation of a cell. Antibodies to Don-1 

CC polypeptides are useful for detecting Don-1 in a sample. The Don-1 

CC polypeptides are useful for treating and diagnosing cell proliferative 

CC disorders and play a role in the proliferation of carcinomas e.g. 

CC adenocarcinoma, myeloma, in cell differentiation, proliferation and 

CC survival. The polypeptides are also useful for inhibiting proliferation 

CC of adenocarcinoma cells, for stimulating the proliferation of cells such 

CC as epithelial cells to promote wound healing, for identifying proteins 

CC that interact with Don-1, and for regulating tumour formation and 

CC progression in the brain. The polynucleotide sequences encoding Don-1 may 

CC be used in gene therapy. The present sequence represents human membrane- 

CC bound splice variant of Don-1 

XX 

SQ Sequence 407 AA; 

Query Match 48.1%; Score 842; DB 6; Length 407; 

Best Local Similarity 98.7%; Pred. No. 4e-48; 

Matches 156; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVL 307
       |||||||||||||||||||||||||||||||||||  |
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQKHL 158

Search completed: August 17, 2004, 14:10:48 
Job time : 53.5478 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: August 17, 2004, 14:09:05 ; Search time 16.2898 Seconds 

(without alignments) 
1045.842 Million cell updates/sec 

Title: US-09-864-675-2

Perfect score: 1749 

Sequence: 1 MRRDPAPGFSMLLFGVSLAC PGTGVSSSQWSTSPSTLDLN 330 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 389414 seqs, 51625971 residues 

Total number of hits satisfying chosen parameters: 389414 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : Issued_Patents_AA:*

1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:*

2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:*

3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:*

4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:*

5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:*

6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
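
Note on the percentage columns: in the records of this report, Query Match
appears to equal Score divided by the Perfect score (1749 for this query), and
Best Local Similarity appears to equal Matches divided by the number of aligned
positions. The short Python sketch below reproduces that arithmetic. The
function names are hypothetical, the formulas are inferred from the reported
figures and are not taken from GenCore documentation, and counting indels in
the denominator is an assumption (every alignment listed here has Indels 0).

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score expressed as a percentage of the query's perfect score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int = 0) -> float:
    """Identical positions as a percentage of all aligned positions.

    Assumption: aligned positions = matches + conservative + mismatches
    + indels. The indel term is untested because it is 0 in this report.
    """
    aligned = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned, 1)


if __name__ == "__main__":
    perfect = 1749  # "Perfect score" from the run header
    # Figures taken from the first alignment of this search (Score 1722).
    print(query_match_pct(1722, perfect))            # 98.5
    print(best_local_similarity_pct(323, 4, 3))      # 97.9
```

The same two functions reproduce the other rows as well, for example Score 960
with Matches 177, Conservative 3, Mismatches 1 gives 54.9 and 97.8.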



SUMMARIES

                %
Result        Query
   No.  Score  Match  Length  DB  ID                 Description
     1   1722   98.5     330   2  US-08-525-864A-4   Sequence 4, Appli
     2   1478   84.5     754   2  US-08-525-864A-2   Sequence 2, Appli
     3    960   54.9     181   3  US-08-753-007A-4   Sequence 4, Appli
     4    960   54.9     181

ALIGNMENTS



RESULT 1 

US-08-525-864A-4 

; Sequence 4, Application US/08525864A
; Patent No. 5912326
; GENERAL INFORMATION:
; APPLICANT: Chang, Han
; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses
; TITLE OF INVENTION: Related thereto
; NUMBER OF SEQUENCES: 18
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: ASCII (text)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/525,864A
; FILING DATE: 8-SEP-1995
; CLASSIFICATION: 530
; ATTORNEY/AGENT INFORMATION:
; NAME: Kara, Catherine J.
; REGISTRATION NUMBER: 41,106
; REFERENCE/DOCKET NUMBER: HUI-017
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 4:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 330 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
US-08-525-864A-4 

Query Match 98.5%; Score 1722; DB 2; Length 330; 

Best Local Similarity 97.9%; Pred. No. 1.2e-147; 

Matches 323; Conservative 4; Mismatches 3; Indels 0; Gaps 0; 

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
       |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 60

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
       |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db 121 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 240

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300

Qy 301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330
       ||||||||||||||||||||||||||||||
Db 301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330



RESULT 2 

US-08-525-864A-2 

; Sequence 2, Application US/08525864A
; Patent No. 5912326
; GENERAL INFORMATION:
; APPLICANT: Chang, Han
; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses
; TITLE OF INVENTION: Related thereto
; NUMBER OF SEQUENCES: 18
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: ASCII (text)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/525,864A
; FILING DATE: 8-SEP-1995
; CLASSIFICATION: 530
; ATTORNEY/AGENT INFORMATION:
; NAME: Kara, Catherine J.
; REGISTRATION NUMBER: 41,106
; REFERENCE/DOCKET NUMBER: HUI-017
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 2:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 754 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
US-08-525-864A-2 

Query Match 84.5%; Score 1478; DB 2; Length 754; 

Best Local Similarity 96.2%; Pred. No. 4.3e-125; 

Matches 278; Conservative 5; Mismatches 6; Indels 0; Gaps 0; 

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
       |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 60

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
       |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db 121 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 240

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 289
       ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289



RESULT 3 

US-08-753-007A-4 

; Sequence 4, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 4:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 181 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-08-753-007A-4 

Query Match 54.9%; Score 960; DB 3; Length 181; 

Best Local Similarity 97.8%; Pred. No. 4.2e-79; 

Matches 177; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db   1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db  61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 329
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 180

Qy 330 N 330
       |
Db 181 N 181



RESULT 4
US-09-398-496-4
; Sequence 4, Application US/09398496
; Patent No. 6133423
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/398,496
; FILING DATE:
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/753,007
; FILING DATE: 19-NOV-1996
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 4:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 181 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-09-398-496-4

Query Match 54.9%; Score 960; DB 3; Length 181; 

Best Local Similarity 97.8%; Pred. No. 4.2e-79; 

Matches 177; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db   1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db  61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 329
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 180

Qy 330 N 330
       |
Db 181 N 181



RESULT 5
US-08-753-007A-8
; Sequence 8, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 8:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 469 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-08-753-007A-8

Query Match 50.4%; Score 881; DB 3; Length 469; 

Best Local Similarity 100.0%; Pred. No. 2.1e-71; 

Matches 163; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy 202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy 262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||||||||||||||||||||||
Db 151 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 193



RESULT 6
US-09-398-496-8
; Sequence 8, Application US/09398496
; Patent No. 6133423
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/398,496
; FILING DATE:
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/753,007
; FILING DATE: 19-NOV-1996
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 8:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 469 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-09-398-496-8

Query Match 50.4%; Score 881; DB 3; Length 469; 

Best Local Similarity 100.0%; Pred. No. 2.1e-71; 

Matches 163; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy 202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy 262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||||||||||||||||||||||
Db 151 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 193



RESULT 7
US-08-753-007A-32
; Sequence 32, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 32:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 647 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-08-753-007A-32

Query Match 50.4%; Score 881; DB 3; Length 647; 

Best Local Similarity 100.0%; Pred. No. 3.3e-71; 

Matches 163; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy 202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy 262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||||||||||||||||||||||
Db 151 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 193



RESULT 8
US-09-398-496-32
; Sequence 32, Application US/09398496
; Patent No. 6133423
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/398,496
; FILING DATE:
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/753,007
; FILING DATE: 19-NOV-1996
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 32:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 647 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-09-398-496-32



Query Match 50.4%; Score 881; DB 3; Length 647; 

Best Local Similarity 100.0%; Pred. No. 3.3e-71; 

Matches 163; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy 202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy 262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||||||||||||||||||||||
Db 151 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 193



RESULT 9
US-08-753-007A-6
; Sequence 6, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 6:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 407 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-08-753-007A-6

Query Match 48.1%; Score 842; DB 3; Length 407; 

Best Local Similarity 98.7%; Pred. No. 5.8e-68; 

Matches 156; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVL 307
       |||||||||||||||||||||||||||||||||||  |
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQKHL 158



RESULT 10
US-09-398-496-6
; Sequence 6, Application US/09398496
; Patent No. 6133423
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/398,496
; FILING DATE:
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/753,007
; FILING DATE: 19-NOV-1996
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 6:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 407 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-09-398-496-6



Query Match 48.1%; Score 842; DB 3; Length 407; 

Best Local Similarity 98.7%; Pred. No. 5.8e-68; 

Matches 156; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVL 307
       |||||||||||||||||||||||||||||||||||  |
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQKHL 158



RESULT 11
US-08-753-007A-2
; Sequence 2, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 2:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 605 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-08-753-007A-2

Query Match 46.9%; Score 821; DB 3; Length 605; 

Best Local Similarity 97.4%; Pred. No. 7.9e-66; 

Matches 151; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db   1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db  61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||||||||||||||
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 155



RESULT 12
US-09-398-496-2
; Sequence 2, Application US/09398496
; Patent No. 6133423
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/398,496
; FILING DATE:
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/753,007
; FILING DATE: 19-NOV-1996
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 2:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 605 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; FRAGMENT TYPE: internal
US-09-398-496-2

Query Match 46.9%; Score 821; DB 3; Length 605; 

Best Local Similarity 97.4%; Pred. No. 7.9e-66; 

Matches 151; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy 150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
       ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db   1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy 210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
       :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db  61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy 270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
       |||||||||||||||||||||||||||||||||||
Db 121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 155



RESULT 13
US-08-753-007A-33
; Sequence 33, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 33:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 139 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
US-08-753-007A-33

Query Match 42.8%; Score 748; DB 3; Length 139; 

Best Local Similarity 98.6%; Pred. No. 4.3e-60; 

Matches 137; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 

Qy 192 RIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHAR 251
       ||||||||||||||||||:||||||||||||||||||||||||:||||||||||||||||
Db   1 RIKYGNGRKNSRLQFNKVRVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHAR 60

Qy 252 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTP 311
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTP 120

Qy 312 GTGVSSSQWSTSPSTLDLN 330
       |||||||||||||||||||
Db 121 GTGVSSSQWSTSPSTLDLN 139



RESULT 14
US-09-398-496-33
; Sequence 33, Application US/09398496
; Patent No. 6133423
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/398,496
; FILING DATE:
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/753,007
; FILING DATE: 19-NOV-1996
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 33:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 139 amino acids
; TYPE: amino acid
; STRANDEDNESS: not relevant
; TOPOLOGY: linear
; MOLECULE TYPE: protein
US-09-398-496-33



Query Match 42.8%; Score 748; DB 3; Length 139; 

Best Local Similarity 98.6%; Pred. No. 4.3e-60; 

Matches 137; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 

Qy 192 RIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHAR 251
       ||||||||||||||||||:||||||||||||||||||||||||:||||||||||||||||
Db   1 RIKYGNGRKNSRLQFNKVRVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHAR 60

Qy 252 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTP 311
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTP 120

Qy 312 GTGVSSSQWSTSPSTLDLN 330
       |||||||||||||||||||
Db 121 GTGVSSSQWSTSPSTLDLN 139



RESULT 15
US-08-525-864A-6
; Sequence 6, Application US/08525864A
; Patent No. 5912326
; GENERAL INFORMATION:
; APPLICANT: Chang, Han
; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses
; TITLE OF INVENTION: Related thereto
; NUMBER OF SEQUENCES: 18
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: ASCII (text)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/525,864A
; FILING DATE: 8-SEP-1995
; CLASSIFICATION: 530
; ATTORNEY/AGENT INFORMATION:
; NAME: Kara, Catherine J.
; REGISTRATION NUMBER: 41,106
; REFERENCE/DOCKET NUMBER: HUI-017
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 6:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 131 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
US-08-525-864A-6



Query Match 40.4%; Score 707; DB 2; Length 131; 

Best Local Similarity 99.2%; Pred. No. 2e-56; 

Matches 130; Conservative 1; Mismatches 0; Indels 0; Gaps 0; 
Qy 200 KNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKS 259
       |||||||||||||||||||||||||||||||||||:||||||||||||||||||||||||
Db   1 KNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKS 60

Qy 260 YCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQ 319
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 YCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQ 120

Qy 320 WSTSPSTLDLN 330
       |||||||||||
Db 121 WSTSPSTLDLN 131



Search completed: August 17, 2004, 14:14:01 
Job time : 17.2898 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: August 17, 2004, 14:06:50 ; Search time 15.2389 Seconds 

(without alignments) 
2083.044 Million cell updates/sec 



Title: US-09-864-675-2 
Perfect score: 1749 

Sequence: 1 MRRDPAPGFSMLLFGVSLAC PGTGVSSSQWSTSPSTLDLN 330 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 283366 seqs, 96191526 residues 

Total number of hits satisfying chosen parameters: 283366 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100%
Listing first 45 summaries 
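
The percentages printed with each hit in this listing appear to follow from the raw counts: Query Match agrees with the hit score divided by the perfect score of 1749, and Best Local Similarity agrees with identical positions divided by all aligned columns. The sketch below is only an illustration of that arithmetic; the function names are hypothetical and are not part of GenCore, and treating indels as aligned columns is an assumption (every hit in this listing reports 0 indels).

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect (self-alignment) score."""
    return 100.0 * score / perfect_score


def local_similarity_percent(matches: int, conservative: int,
                             mismatches: int, indels: int = 0) -> float:
    """Identical positions as a percentage of all aligned columns.

    Counting indels as columns is an assumption; it cannot be checked
    against this listing because every hit reports 0 indels.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Values taken from hits in this listing (score 1610; 297/4/6 columns).
print(round(query_match_percent(1610, 1749), 1))
print(round(local_similarity_percent(297, 4, 6), 1))
```

Run against the reported counts, these reproduce the printed 92.1% and 96.7% figures, which is a quick consistency check when OCR has damaged a digit.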



Database: PIR_78:*
pir1:*
pir2:*
pir3:*
pir4:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
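
Because Pred. No. is an expected count of chance hits, it can be turned into a probability of seeing at least one such hit. The sketch below assumes chance hits are Poisson distributed, which the report does not state; GenCore's own derivation from the score distribution is not reproduced here, and the function name is hypothetical.

```python
import math


def chance_hit_probability(pred_no: float) -> float:
    """P(at least one chance hit) given an expected count, assuming Poisson.

    Computes 1 - exp(-pred_no) via expm1 so that very small
    expected counts do not round to zero.
    """
    return -math.expm1(-pred_no)


# For tiny expected counts the probability is essentially the count itself;
# the two only diverge as the expected count approaches 1 or more.
for pred_no in (1.1e-114, 0.05, 1.0, 10.0):
    print(pred_no, chance_hit_probability(pred_no))
```

For the hits listed here, whose Pred. No. values are far below 1, the expected count and the probability are numerically indistinguishable.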
ALIGNMENTS



RESULT 1
JC5700
ErbB kinase activator alpha, brain and thymus - human
C;Species: Homo sapiens (man)
C;Date: 25-Nov-1997 #sequence_revision 25-Nov-1997 #text_change 08-Sep-2002
C;Accession: JC5700
R;Higashiyama, S.; Horikawa, M.; Yamada, K.; Ichino, N.; Nakano, N.; Nakagawa,
T.; Miyagawa, J.; Matsushita, N.; Nagatsu, T.; Taniguchi, N.; Ishiguro, H.
J. Biochem. 122, 675-680, 1997
A;Title: A novel brain-derived member of the epidermal growth factor family that
interacts with ErbB3 and ErbB4.
A;Reference number: JC5700; MUID:98006324; PMID:9348101
A;Accession: JC5700
A;Status: nucleic acid sequence not shown
A;Molecule type: mRNA
A;Residues: 1-850 <HIG>
A;Cross-references: DDBJ:AB005060; NID:g2626738; PIDN:BAA23417.1; PID:g2626739
A;Experimental source: SK-NSH cell
C;Comment: This protein is a member of the epidermal growth factor family. It is
functionally similar to neurogulin in terms of directly activating ErbB4,
transactivating ErbB1, B2 and B3, and stimulating the differentiation of
MDA-MB-453 cells.
C;Superfamily: human ErbB kinase activator alpha, brain and thymus; EGF
homology; immunoglobulin homology
C;Keywords: glycoprotein
F;258-311/Domain: Ig-like #status predicted <IGL>
F;345-381/Domain: EGF homology <EGF>
F;346-381/Domain: EGF-like #status predicted <EGF2>
F;147,278,451/Binding site: carbohydrate (Asn) (covalent) #status predicted

Query Match 92.1%; Score 1610; DB 2; Length 850; 

Best Local Similarity 100.0%; Pred. No. 1.1e-114;

Matches 304; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
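
Each hit carries the same three-line statistics block, so the fields can be collected mechanically. The sketch below is one possible way to do that, assuming semicolon-separated `name value` pairs as they appear in this listing; `parse_stats` is a hypothetical helper, not a GenCore utility, and the sample lines are transcribed from the first PIR result.

```python
def parse_stats(lines):
    """Collect 'name value;' pairs from a hit's statistics lines into a dict.

    Assumes every field ends with a numeric value (optionally followed by '%')
    and that fields are separated by semicolons.
    """
    stats = {}
    for line in lines:
        for field in line.split(";"):
            field = field.strip()
            if not field:
                continue
            # The value is the last whitespace-separated token of the field.
            name, _, value = field.rpartition(" ")
            stats[name.strip()] = float(value.rstrip("%"))
    return stats


block = [
    "Query Match 92.1%; Score 1610; DB 2; Length 850;",
    "Best Local Similarity 100.0%; Pred. No. 1.1e-114;",
    "Matches 304; Conservative 0; Mismatches 0; Indels 0; Gaps 0;",
]
stats = parse_stats(block)
print(stats["Score"], stats["Pred. No."], stats["Matches"])
```

Splitting on the last space rather than the first keeps multi-word names such as `Best Local Similarity` and `Pred. No.` intact.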

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 392

Qy 301 DPKQ 304
       ||||
Db 393 DPKQ 396



RESULT 2
JC5701
ErbB kinase activator alpha1, brain and thymus - rat
C;Species: Rattus norvegicus (Norway rat)
C;Date: 25-Nov-1997 #sequence_revision 25-Nov-1997 #text_change 08-Sep-2002
C;Accession: JC5701; PC4411
R;Higashiyama, S.; Horikawa, M.; Yamada, K.; Ichino, N.; Nakano, N.; Nakagawa,
T.; Miyagawa, J.; Matsushita, N.; Nagatsu, T.; Taniguchi, N.; Ishiguro, H.
J. Biochem. 122, 675-680, 1997
A;Title: A novel brain-derived member of the epidermal growth factor family that
interacts with ErbB3 and ErbB4.
A;Reference number: JC5700; MUID:98006324; PMID:9348101
A;Accession: JC5701
A;Molecule type: mRNA
A;Residues: 1-868 <HIG>
A;Cross-references: DDBJ:D89995; NID:g2605629; PIDN:BAA23344.1; PID:g2605630
A;Accession: PC4411
A;Molecule type: protein
A;Residues: 128-162 <HI2>
A;Experimental source: PC-12 cell
C;Comment: This protein is a member of the epidermal growth factor family. It is
functionally similar to neurogulin in terms of directly activating ErbB4,
transactivating ErbB1, B2 and B3, and stimulating the differentiation of
MDA-MB-453 cells.
C;Superfamily: human ErbB kinase activator alpha, brain and thymus; EGF
homology; immunoglobulin homology
F;361-397/Domain: EGF homology <EGF>

Query Match 90.1%; Score 1576; DB 2; Length 868; 

Best Local Similarity 96.7%; Pred. No. 4.4e-112; 

Matches 297; Conservative 4; Mismatches 6; Indels 0; Gaps 0; 

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
       |||||||| |||||||||||||||||||||||||||||||||||| ||||||||||||||
Db 109 MRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 168

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 169 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 228

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
       |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db 229 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 288

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db 289 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 348

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 349 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 408

Qy 301 DPKQSVL 307
       ||||  |
Db 409 DPKQKHL 415



RESULT 3
JC5702
ErbB kinase activator alpha2a, brain and thymus - rat
C;Species: Rattus norvegicus (Norway rat)
C;Date: 25-Nov-1997 #sequence_revision 25-Nov-1997 #text_change 08-Sep-2002
C;Accession: JC5702; PC4417
R;Higashiyama, S.; Horikawa, M.; Yamada, K.; Ichino, N.; Nakano, N.; Nakagawa,
T.; Miyagawa, J.; Matsushita, N.; Nagatsu, T.; Taniguchi, N.; Ishiguro, H.
J. Biochem. 122, 675-680, 1997
A;Title: A novel brain-derived member of the epidermal growth factor family that
interacts with ErbB3 and ErbB4.
A;Reference number: JC5700; MUID:98006324; PMID:9348101
A;Accession: JC5702
A;Status: nucleic acid sequence not shown
A;Molecule type: mRNA
A;Residues: 1-860 <HIG>
A;Cross-references: DDBJ:D89996; NID:g2605631; PIDN:BAA23345.1; PID:g2605632
A;Experimental source: PC-12 cell
A;Accession: PC4417
A;Status: nucleic acid sequence not shown
A;Molecule type: mRNA
A;Residues: 'F',212-213,223-860 <HI2>
A;Cross-references: DDBJ:AB001576; NID:g2605478; PIDN:BAA23348.1; PID:g2605479
A;Experimental source: PC-12 cell
C;Comment: This protein is a member of the epidermal growth factor family. It is
functionally similar to neurogulin in terms of directly activating ErbB4,
transactivating ErbB1, B2 and B3, and stimulating the differentiation of
MDA-MB-453 cells.
C;Superfamily: human ErbB kinase activator alpha, brain and thymus; EGF
homology; immunoglobulin homology
C;Keywords: glycoprotein
F;274-327/Domain: Ig-like #status predicted <IGL>
F;361-397/Domain: EGF homology <EGF>
F;422-444/Domain: hydrophobic #status predicted <HYD>
F;163,294,467/Binding site: carbohydrate (Asn) (covalent) #status predicted

Query Match 90.1%; Score 1575; DB 2; Length 860; 

Best Local Similarity 97.4%; Pred. No. 5.2e-112; 

Matches 296; Conservative 4; Mismatches 4; Indels 0; Gaps 0; 

Qy   1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
       |||||||| ||||||||||||||||||||||||||||||||||||| |||||||||||||
Db 109 MRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGSSSNSTREP 168

Qy  61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db 169 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 228

Qy 121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
       |:| ||||:||||||||||||||||||||||||||:||||||||||||||||||||||||
Db 229 PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 288

Qy 181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db 289 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 348

Qy 241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 349 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 408

Qy 301
Db 409



RESULT 4 
S32357 

glial growth factor - human 
C;Species: Homo sapiens (man)
C;Date: 02-Dec-1993 #sequence_revision 10-Nov-1995 #text_change 08-Sep-2002
C;Accession: S32357
R;Marchionni, M.A.; Goodearl, A.D.J.; Chen, M.S.; Bermingham-McDonogh, O.; Kirk,
C.; Hendricks, M.; Danehy, F.; Misumi, D.; Sudhalter, J.; Kobayashi, K.;
Wroblewski, D.; Lynch, C.; Baldassare, M.; Hiles, I.; Davis, J.B.; Hsuan, J.J.;
Totty, N.F.; Otsu, M.; McBurney, R.N.; Waterfield, M.D.; Stroobant, P.; Gwynne,
D.
Nature 362, 312-318, 1993
A;Title: Glial growth factors are alternatively spliced erbB2 ligands expressed
in the nervous system.
A;Reference number: S32357; MUID:93205115; PMID:8096067
A;Accession: S32357
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-422 <MAR>
A;Cross-references: GB:L12260; NID:g292047; PIDN:AAB59622.1; PID:g292048
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology
F;363-402/Domain: EGF homology <EGF>

Query Match 29.9%; Score 523; DB 2; Length 422; 

Best Local Similarity 35.5%; Pred. No. 2.9e-32; 

Matches 124; Conservative 59; Mismatches 88; Indels 78; Gaps 13; 

Qy  15 GVSLACYS -- PSLKSVQDQAYKAPVVVEGKV QGLV PAGGSSS -- NSTRE 59
11:111 I I : I I I : I : I M : I I I I I I : I I : M
Db  58 GASV-CYSSPPSVGSVQELAQRAAWIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDRE 116

Qy  60 PPASGRVA LVKVLDKWPLRSGGLQ 83
|||:|| I I I I I : : : I I I :
Db 117 PPAAGPRALGPPAEEPLIAANGTVPSWPTAPVPSAGEPGEEAPYLVKVliQWAVKAGGLK 176

Qy  84 REQVISV GSCVPLERNQRYI FFLEP TEQPLVFKTAFAPLDTNGKN 128
: : : : : I II I : : I I I I I : I I : I I : : I I I : I I : I
Db 177 KDSLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRN 235

Qy 129 LKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS 188
I I I I I : : I I I I I : I I : I I I I II I : I I : : : : I I I : I I I I I
Db 236 LKKEVSRVLCKRCALPPQLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRK 295

Qy 189 RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245
: : I : I : I : I I : I I : I : I I I : I : : I I I : : : : I : I
Db 296 NKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTS 353

Qy 246 WSG -- HARKCNETAKSYCVNGGVCYYIEGINQLS CKCPNGFFGQRC 289
: I I III I : : I I I I I I : : : : : I I I I I I I I II
Db 354 TTGTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRC 402


RESULT 5 
I38404

neu differentiation factor - human
C;Species: Homo sapiens (man)
C;Date: 29-May-1998 #sequence_revision 29-May-1998 #text_change 08-Sep-2002
C;Accession: I38404
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.;
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S.
Mol. Cell. Biol. 14, 1909-1919, 1994
A;Title: Structural and functional aspects of the multiplicity of Neu
differentiation factors.
A;Reference number: A56210; MUID:94158863; PMID:7509448
A;Accession: I38404
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-462 <RES>
A;Cross-references: EMBL:U02326; NID:g408402; PIDN:AAA19951.1; PID:g408403
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology

Query Match 16.8%; Score 293.5; DB 2; Length 462; 

Best Local Similarity 32.1%; Pred. No. 9.6e-15; 

Matches 71; Conservative 37; Mismatches 62; Indels 51; Gaps 7; 

Qy 126 GKNLKKEVGKI LCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178 

II I I I I : I I : I I : I I I I II I : I I : : : : I 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70 

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234

I I : I I I II : : I : I : I : I I : I I : I : I I I : I : : I I I : : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANIT 128

Qy 235 YV NSVSTTLSSWSG — HARKCNETAKS 259 

II I : I : I : I : I I I I I I : 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188 

Qy 260 YCVNGGVCYYIEGINQLS CKCPNGFFGQRCLEKLPLRL 297 

: I I I I I I : : : : : I III I I I I I I : I : : : 
Db 189 FCVNGGECFMVKDLSNPSRYLCKCQPGFTGARCTENVPMKV 229 



RESULT 6 
A43273 

heregulin precursor, splice form alpha - human 

N;Alternate names: breast cancer cell differentiation factor p45; Neu
differentiation factor
C;Species: Homo sapiens (man)
C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 08-Sep-2002
C;Accession: A43273; A48498; A38155
R;Holmes, W.E.; Sliwkowski, M.X.; Akita, R.W.; Henzel, W.J.; Lee, J.; Park,
J.W.; Yansura, D.; Abadi, N.; Raab, H.; Lewis, G.D.; Shepard, H.M.; Kuang, W.J.;
Wood, W.I.; Goeddel, D.V.; Vandlen, R.L.
Science 256, 1205-1210, 1992
A;Title: Identification of heregulin, a specific activator of p185(erbB2).
A;Reference number: A43273; MUID:92271253; PMID:1350381
A;Accession: A43273
A;Status: nucleic acid sequence not shown; not compared with conceptual
translation
A;Molecule type: mRNA
A;Residues: 1-640 <HOL>
A;Experimental source: breast tumor cell line, MDA-MB-231, ATCC HTB 26
A;Note: sequence extracted from NCBI backbone (NCBIP:103250)
R;Culouscou, J.M.; Plowman, G.D.; Carlton, G.W.; Green, J.M.; Shoyab, M.
J. Biol. Chem. 268, 18407-18410, 1993
A;Title: Characterization of a breast cancer cell differentiation factor that
specifically activates the HER4/p180(erbB4) receptor.
A;Reference number: A48498; MUID:93366731; PMID:7689552
A;Accession: A48498
A;Molecule type: protein
A;Residues: 20-21,'X',23-24,'XX',27-28 <CUL>
R;Peles, E.; Bacus, S.S.; Koski, R.A.; Lu, H.S.; Wen, D.; Ogden, S.G.; Levy,
R.B.; Yarden, Y.
Cell 69, 205-216, 1992
A;Title: Isolation of the neu/HER-2 stimulatory ligand: a 44 kd glycoprotein
that induces differentiation of mammary tumor cells.
A;Reference number: A38155; MUID:92208945; PMID:1348215
A;Accession: A38155
A;Molecule type: protein
A;Residues: 'X',15-16,'X',18-20,'RG',23-24,'GP',27,'E',29,'XP',32-36 <PEL>
A;Note: sequence extracted from NCBI backbone (NCBIP:91347)
C;Genetics:
A;Gene: GDB:HGL
A;Cross-references: GDB:132656; OMIM:142445
A;Map position: 8p22-8p11
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology
C;Keywords: alternative splicing; glycoprotein
F;182-221/Domain: EGF homology <EGF>

Query Match 16.7%; Score 292.5; DB 2; Length 640; 

Best Local Similarity 32.1%; Pred. No. 1.7e-14; 

Matches 71; Conservative 37; Mismatches 62; Indels 51; Gaps 7 

Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178 

II I I I I : I I : I I : I I I I II I : I I : : : : I 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPQLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70 

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234

I I : I I I I I : : I : I : I : I I : I I : I : I I I : I : : I I I : : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK — SELRINKASLADSGEYMCKVISKLGNDSASANIT 128 

Qy 235 YV NSVSTTLSSWSG — HARKCNETAKS 259 

I I | : | : | : | : | I Ml | : 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188 

Qy 260 YCVNGGVCYYIEGINQLS CKCPNGFFGQRCLEKLPLRL 297

: I I I I I I : :: :: I III I I I I I I : I ::: 
Db 189 FCVNGGECFMVKDLSNPSRYLCKCQPGFTGARCTENVPMKV 229 



RESULT 7 
I61719

neu differentiation factor - rat
C;Species: Rattus norvegicus (Norway rat)
C;Date: 29-May-1998 #sequence_revision 29-May-1998 #text_change 08-Sep-2002
C;Accession: I61719; I61723; I61716; I61717; I61724; A38220
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.;
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S.
Mol. Cell. Biol. 14, 1909-1919, 1994
A;Title: Structural and functional aspects of the multiplicity of Neu
differentiation factors.
A;Reference number: A56210; MUID:94158863; PMID:7509448
A;Accession: I61719
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-639 <RES>
A;Cross-references: EMBL:U02319; NID:g408388; PIDN:AAA19944.1; PID:g408389
A;Accession: I61723
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-639 <RE2>
A;Cross-references: EMBL:U02323; NID:g408396; PIDN:AAA19948.1; PID:g408397
A;Accession: I61716
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-422,'H','NL',637-638,'ELRRNKAYRSKCMQIQLSATHLRPSSITHLGFIL' <RE3>
A;Cross-references: EMBL:U02316; NID:g408382; PIDN:AAA19941.1; PID:g408383
A;Accession: I61717
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-422,'H','NL',637-638,'ELRRNKAYRSKCMQIQLSATHLRPSSITHLGFIL' <RE4>
A;Cross-references: EMBL:U02317; NID:g408384; PIDN:AAA19942.1; PID:g408385
A;Accession: I61724
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-422 <RE5>
A;Cross-references: EMBL:U02324; NID:g408398; PIDN:AAA19949.1; PID:g408399
R;Wen, D.; Peles, E.; Cupples, R.; Suggs, S.V.; Bacus, S.S.; Luo, Y.; Trail, G.;
Hu, S.; Silbiger, S.M.; Levy, R.B.; Koski, R.A.; Lu, H.S.; Yarden, Y.
Cell 69, 559-572, 1992
A;Title: Neu differentiation factor: a transmembrane glycoprotein containing an
EGF domain and an immunoglobulin homology unit.
A;Reference number: A38220; MUID:92257596; PMID:1349853
A;Accession: A38220
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-422 <WEN>
A;Note: sequence extracted from NCBI backbone (NCBIN:101767, NCBIP:101768)
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology

Query Match 16.4%; Score 286; DB 2; Length 639; 

Best Local Similarity 32.8%; Pred. No. 5.3e-14; 

Matches 65; Conservative 35; Mismatches 54; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNG 198 

I I : I I : I I I I II I : I I : : : : II I : I I I II : | : | : | 

Db 34 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPG 93 

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236

: I I : I I : I : I I I : I : : I I I : : I I 

Db 94 K — SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSE 151 

Qy 237 NSVSTTLSSWSG -- HARKCNETAKSYCVNGGVCYYIEGINQLS CK 279

I : I : I : I : I I I I I I : : II I I I I : : : : : I II 
Db 152 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCK 211 

Qy 280 CPNGFFGQRCLEKLPLRL 297

I I I I I I I : I : : : 
Db 212 CQPGFTGARCTENVPMKV 229



RESULT 8 
D43273 

heregulin precursor, splice form beta-3 - human 

N;Alternate names: glial growth factor HRG-beta-3; neuregulin 

C;Species: Homo sapiens (man)
C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 08-Sep-2002
C;Accession: D43273; S32358
R;Holmes, W.E.; Sliwkowski, M.X.; Akita, R.W.; Henzel, W.J.; Lee, J.; Park,
J.W.; Yansura, D.; Abadi, N.; Raab, H.; Lewis, G.D.; Shepard, H.M.; Kuang, W.J.;
Wood, W.I.; Goeddel, D.V.; Vandlen, R.L.
Science 256, 1205-1210, 1992
A;Title: Identification of heregulin, a specific activator of p185(erbB2).
A;Reference number: A43273; MUID:92271253; PMID:1350381
A;Accession: D43273
A;Status: preliminary; nucleic acid sequence not shown; not compared with
conceptual translation
A;Molecule type: mRNA
A;Residues: 1-241 <HOL>
R;Marchionni, M.A.; Goodearl, A.D.J.; Chen, M.S.; Bermingham-McDonogh, O.; Kirk,
C.; Hendricks, M.; Danehy, F.; Misumi, D.; Sudhalter, J.; Kobayashi, K.;
Wroblewski, D.; Lynch, C.; Baldassare, M.; Hiles, I.; Davis, J.B.; Hsuan, J.J.;
Totty, N.F.; Otsu, M.; McBurney, R.N.; Waterfield, M.D.; Stroobant, P.; Gwynne,
D.
Nature 362, 312-318, 1993
A;Title: Glial growth factors are alternatively spliced erbB2 ligands expressed
in the nervous system.
A;Reference number: S32357; MUID:93205115; PMID:8096067
A;Accession: S32358
A;Molecule type: mRNA
A;Residues: 1-241 <MAR>
A;Cross-references: GB:L12261; NID:g292049; PIDN:AAB59358.1; PID:g292050
C;Genetics:
A;Gene: GDB:HGL; GGF
A;Cross-references: GDB:132656; OMIM:142445
A;Map position: 8p22-8p11
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology
C;Keywords: alternative splicing
F;182-221/Domain: EGF homology <EGF>

Query Match 16.3%; Score 285.5; DB 2; Length 241; 

Best Local Similarity 32.9%; Pred. No. 1.8e-14; 

Matches 70; Conservative 33; Mismatches 59; Indels 51; Gaps 7; 

Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178

II I I I I : I I : | | : M I I II I : I I : : : : | 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70 

Qy 179 FKDGKELNRS-- RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234
I I : I I I I I : : I : I : | : | | : I I : I : I I I : i : : | | | : : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK — SELRINKASLADSGEYMCKVISKLGNDSASANIT 128 

Qy 235 YV NSVSTTLSSWSG — HARKCN ETAKS 259 

II I : I : I : I : I I I I I I : 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188 

Qy 260 YCVNGGVCYYIEGINQLS -- CKCPNGFFGQRC 289

HUM I: :: :: I Mill I I II 
Db 189 FCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRC 221 



RESULT 9 
C43273 

heregulin precursor, splice form beta-2 - human 



C;Species: Homo sapiens (man)
C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 08-Sep-2002
C;Accession: C43273; I38407
R;Holmes, W.E.; Sliwkowski, M.X.; Akita, R.W.; Henzel, W.J.; Lee, J.; Park,
J.W.; Yansura, D.; Abadi, N.; Raab, H.; Lewis, G.D.; Shepard, H.M.; Kuang, W.J.;
Wood, W.I.; Goeddel, D.V.; Vandlen, R.L.
Science 256, 1205-1210, 1992
A;Title: Identification of heregulin, a specific activator of p185(erbB2).
A;Reference number: A43273; MUID:92271253; PMID:1350381
A;Accession: C43273
A;Status: preliminary; nucleic acid sequence not shown; not compared with
conceptual translation
A;Molecule type: mRNA
A;Residues: 1-637 <HOL>
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.;
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S.
Mol. Cell. Biol. 14, 1909-1919, 1994
A;Title: Structural and functional aspects of the multiplicity of Neu
differentiation factors.
A;Reference number: A56210; MUID:94158863; PMID:7509448
A;Accession: I38407
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 119-406 <RES>
A;Cross-references: EMBL:U02329; NID:g408408; PIDN:AAA19954.1; PID:g408409
C;Genetics:
A;Gene: GDB:HGL
A;Cross-references: GDB:132656; OMIM:142445
A;Map position: 8p22-8p11
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology
C;Keywords: alternative splicing
F;182-221/Domain: EGF homology <EGF>

Query Match 16.3%; Score 285.5; DB 2; Length 637; 

Best Local Similarity 32.9%; Pred. No. 5.7e-14; 

Matches 70; Conservative 33; Mismatches 59; Indels 51; Gaps 7 

Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178

II MM : I I I |:|| :: ::| 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70 

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234

I I : I I I I I :: | : | : | : | | : | | : | : | | | : | : : | | | : : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK— SELRINKASLADSGEYMCKVISKLGNDSASANIT 128 

Qy 235 YV NSVSTTLSSWSG — HARKCNETAKS 259 

" I: Ml :| :| I M I I: 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188 

Qy 260 YCVNGGVCYYIEGINQLS CKCPNGFFGQRC 289

: M I I I I : : : : : I I I II 

Db 189 FCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRC 221



RESULT 10 
B43273 

heregulin, splice form beta 1 - human 



C;Species: Homo sapiens (man)
C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 08-Sep-2002
C;Accession: B43273; I38406
R;Holmes, W.E.; Sliwkowski, M.X.; Akita, R.W.; Henzel, W.J.; Lee, J.; Park,
J.W.; Yansura, D.; Abadi, N.; Raab, H.; Lewis, G.D.; Shepard, H.M.; Kuang, W.J.;
Wood, W.I.; Goeddel, D.V.; Vandlen, R.L.
Science 256, 1205-1210, 1992
A;Title: Identification of heregulin, a specific activator of p185(erbB2).
A;Reference number: A43273; MUID:92271253; PMID:1350381
A;Accession: B43273
A;Status: preliminary; nucleic acid sequence not shown; not compared with
conceptual translation
A;Molecule type: mRNA
A;Residues: 1-645 <HOL>
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.;
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S.
Mol. Cell. Biol. 14, 1909-1919, 1994
A;Title: Structural and functional aspects of the multiplicity of Neu
differentiation factors.
A;Reference number: A56210; MUID:94158863; PMID:7509448
A;Accession: I38406
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 'A',95-418,'F',420-645 <RES>
A;Cross-references: EMBL:U02328; NID:g408406; PIDN:AAA19953.1; PID:g408407
C;Genetics:
A;Gene: GDB:HGL
A;Cross-references: GDB:132656; OMIM:142445
A;Map position: 8p22-8p11
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology
C;Keywords: alternative splicing
F;182-221/Domain: EGF homology <EGF>

Query Match 16.3%; Score 285.5; DB 2; Length 645; 

Best Local Similarity 32.9%; Pred. No. 5.8e-14; 

Matches 70; Conservative 33; Mismatches 59; Indels 51; Gaps 7 

Qy 126 GKNLKKEVGKILCTDCAT RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW 178 

M I I I I : I I : I I : II I I II I : I I : : : : | 

Db 11 GKGKKKERGSGKKPESAAGSQSPALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKW 70

Qy 179 FKDGKELNRS RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL- 234

11:11111 : : I : I : I : I I : I I : I : I I I : I : : I I I : : 

Db 71 FKNGNELNRKNKPQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANIT 128

Qy 235 YV NSVSTTLSSWSG — HARKCNETAKS 259 

M I : I : I : I : I I I I I I : 

Db 129 IVESNEIITGMPASTEGAYVSSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKT 188

Qy 260 YCVNGGVCYYI EGINQLS CKCPNGFFGQRC 289 

: I M M I : : : : : I I I I I I I I I I 
Db 189 FCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRC 221 

RESULT 11 
A45769 

acetylcholine receptor synthesis stimulator ARIA-1 precursor - chicken 



C;Species: Gallus gallus (chicken)
C;Date: 20-Feb-1995 #sequence_revision 20-Feb-1995 #text_change 08-Sep-2002
C;Accession: A45769
R;Falls, D.L.; Rosen, K.M.; Corfas, G.; Lane, W.S.; Fischbach, G.D.
Cell 72, 801-815, 1993
A;Title: ARIA, a protein that stimulates acetylcholine receptor synthesis, is a
member of the neu ligand family.
A;Reference number: A45769; MUID:93201602; PMID:8453670
A;Accession: A45769
A;Status: preliminary
A;Molecule type: mRNA; protein
A;Residues: 1-602 <FAL>
A;Cross-references: GB:L11264; NID:g212603; PIDN:AAA49037.1; PID:g212604
A;Experimental source: brain
A;Note: sequence extracted from NCBI backbone (NCBIN:127787, NCBIP:127788)
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology

Query Match 16.3%; Score 284.5; DB 2; Length 602; 

Best Local Similarity 32.8%; Pred. No. 6.4e-14; 

Matches 62; Conservative 34; Mismatches 72; Indels 21; Gaps 5 

Qy 109 TEQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAA 168 

: I I I : I II : I I I I : I I : I II : I I : I I 

Db 5 S EGP LQYS LAPTQTDVNS SYNTVPPKLKEMKNQEVAVGQKLVLRCETT 52 

Qy 169 AGNPQPSYRWFKDGKEL NRS RDI RI KYGNGRKNS RLQFNKVKVEDAGE YVCEAENIL 225

J I : : I I : I I I : II : : : I : , I I : : Mill I : I 

Db 53 SEYPALRFKWLKNGKEITKKNRPENVKIP-KKQKKYSELHIYRATLADAGEYACRVSSKL 111 

Qy 226 GKDTVRGRLYVNSVSTTLSSWSG— HARKCNETAKSYCVNGGVCYYIEGI NQLSCKC 280 

II— : = : I : I : | | | | : I : : I I I I I I I : : : : | : | 

Db 112 GNDSTKASVIITDTNATSTSTTGTSHLTKCDIKQKAFCVNGGECYMVKDLPNPPRYLCRC 171 

Qy 281 PNGFFGQRC 289 

II I I II 
Db 172 PNEFTGDRC 180



RESULT 12 
I61722

neu differentiation factor - rat
C;Species: Rattus norvegicus (Norway rat)
C;Date: 29-May-1998 #sequence_revision 29-May-1998 #text_change 08-Sep-2002
C;Accession: I61722
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.;
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S.
Mol. Cell. Biol. 14, 1909-1919, 1994
A;Title: Structural and functional aspects of the multiplicity of Neu
differentiation factors.
A;Reference number: A56210; MUID:94158863; PMID:7509448
A;Accession: I61722
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-662 <RES>
A;Cross-references: EMBL:U02322; NID:g408394; PIDN:AAA19947.1; PID:g408395
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology
F;182-221/Domain: EGF homology <EGF>



Query Match 16.2%; Score 283; DB 2; Length 662; 

Best Local Similarity 33.0%; Pred. No. 9.3e-14; 

Matches 66; Conservative 32; Mismatches 58; Indels 44; Gaps 6 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS -- RDIRIKYGNG 198

' I V I I : |:|| :: ::|||:| Mil :|:|: I 

Db 34 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPG 93 

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236

: I I : M : I : I I I : I : : I I I : : | | 

Db 94 K--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSE 151

Qy 237 NSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGINQLS CK 279

I : I : I : I : I I Ml I : : | | | I I I : : : : : I | | 
Db 152 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCK 211 

Qy 280 CPNGFFGQRCLEKLPLRLYM 299

I II I I I I : M 
Db 212 CPNEFTGDRCQNYVMASFYM 231 






RESULT 13 
A56210 

neu differentiation factor - rat (fragment) 
C;Species: Rattus norvegicus (Norway rat)
C;Date: 02-Jul-1996 #sequence_revision 02-Jul-1996 #text_change 08-Sep-2002
C;Accession: A56210
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.;
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S.
Mol. Cell. Biol. 14, 1909-1919, 1994
A;Title: Structural and functional aspects of the multiplicity of Neu
differentiation factors.
A;Reference number: A56210; MUID:94158863; PMID:7509448
A;Accession: A56210
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-230 <RES>
A;Cross-references: EMBL:U02315; NID:g408380; PIDN:AAA19940.1; PID:g408381
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology

Query Match 15.9%; Score 278; DB 2; Length 230; 

Best Local Similarity 33.7%; Pred. No. 6.3e-14; 

Matches 64; Conservative 31; Mismatches 51; Indels 44; Gaps 6;



Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNG 198 

' 1=11=1111 II |:|| :: ::|||:| |||| :| : | : | 

Db 23 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPG 82 

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENI LGKDTVRGRL YV 236 

= I I = I I : I : I I I : I : : I I I : : | | 

Db 83 K--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSE 140

Qy 237 NSVSTTLSSWSG— HARKCNETAKSYCVNGGVCYYIEGINQLS— CK 279 

1=1 = 1=1=1 I M I I :: I I I I I | : : : : : | || 
Db 141 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCK 200 



Qy 280 CPNGFFGQRC 289

I I I I I I I 
Db 201 CPNEFTGDRC 210



RESULT 14 
I61718

neu differentiation factor - rat
C;Species: Rattus norvegicus (Norway rat)
C;Date: 29-May-1998 #sequence_revision 29-May-1998 #text_change 08-Sep-2002
C;Accession: I61718; I61721; I61720
R;Wen, D.; Suggs, S.V.; Karunagaran, D.; Liu, N.; Cupples, R.L.; Luo, Y.;
Janssen, A.M.; Ben-Baruch, N.; Trollinger, D.B.; Jacobsen, V.L.; Meng, S.
Mol. Cell. Biol. 14, 1909-1919, 1994
A;Title: Structural and functional aspects of the multiplicity of Neu
differentiation factors.
A;Reference number: A56210; MUID:94158863; PMID:7509448
A;Accession: I61718
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-636 <RES>
A;Cross-references: EMBL:U02318; NID:g408386; PIDN:AAA19943.1; PID:g408387
A;Accession: I61721
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-444,'A',446-636 <RE2>
A;Cross-references: EMBL:U02321; NID:g408392; PIDN:AAA19946.1; PID:g408393
A;Accession: I61720
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-298,386,'V',388,'TR',391 <RE3>
A;Cross-references: EMBL:U02320; NID:g408390; PIDN:AAA19945.1; PID:g408391
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology
F;182-221/Domain: EGF homology <EGF>

Query Match 15.9%; Score 278; DB 2; Length 636; 

Best Local Similarity 33.7%; Pred. No. 2.1e-13; 

Matches 64; Conservative 31; Mismatches 51; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS -- RDIRIKYGNG 198
I'lhllll II |:|| :: ::|||:| Mil :|:|: |
Db  34

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236
I : I I I : I : : I I I : : | |
Db  94

Qy 237 NSVSTTLSSWSG -- HARKCNETAKSYCVNGGVCYYIEGINQLS CK 279
I : I : I : I : I I III I : : I I I I I I : : : : : I II
Db 152

Qy 280
Ml I I II
Db 212


RESULT 15 



S32359 

glial growth factor - bovine 

C;Species: Bos primigenius taurus (cattle)
C;Date: 19-Mar-1997 #sequence_revision 01-Aug-1997 #text_change 08-Sep-2002
C;Accession: S32359
R;Marchionni, M.A.; Goodearl, A.D.J.; Chen, M.S.; Bermingham-McDonogh, O.; Kirk,
C.; Hendricks, M.; Danehy, F.; Misumi, D.; Sudhalter, J.; Kobayashi, K.;
Wroblewski, D.; Lynch, C.; Baldassare, M.; Hiles, I.; Davis, J.B.; Hsuan, J.J.;
Totty, N.F.; Otsu, M.; McBurney, R.N.; Waterfield, M.D.; Stroobant, P.; Gwynne,
D.
Nature 362, 312-318, 1993
A;Title: Glial growth factors are alternatively spliced erbB2 ligands expressed
in the nervous system.
A;Reference number: S32357; MUID:93205115; PMID:8096067
A;Accession: S32359
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-241 <MAR>
A;Cross-references: GB:L12259; NID:g289413; PIDN:AAA30540.1; PID:g289414
C;Superfamily: human heregulin; EGF homology; immunoglobulin homology
F;182-221/Domain: EGF homology <EGF>

Query Match 15.5%; Score 271; DB 2; Length 241; 

Best Local Similarity 32.6%; Pred. No. 2.3e-13; 

Matches 62; Conservative 34; Mismatches 50; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKEL NRSRDIRIKYGNG 198
1 1=11:1111 II |:|| :: ::|||:| II |: ::|:|: I
Db  34

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRL YV 236
III:: ||
Db  94

Qy 237 NSVSTTLSSWSG -- HARKCNETAKSYCVNGGVCYYIEGINQLS -- CK 279
1=1:1:1:1 I Ml I : : I M M I : : : : : | ||
Db 152

Qy 280
I I I I I I I
Db 212


Search completed: August 17, 2004, 14:13:19
Job time : 16.2389 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM protein - protein search, using sw model

Run on:          August 17, 2004, 14:12:46 ; Search time 44.6656 Seconds
                 (without alignments)
                 2319.368 Million cell updates/sec

Title:           US-09-864-675-2
Perfect score:   1749
Sequence:        1 MRRDPAPGFSMLLFGVSLAC PGTGVSSSQWSTSPSTLDLN 330

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        1292805 seqs, 313927144 residues

Total number of hits satisfying chosen parameters: 1292805

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database:        Published_Applications_AA:*
                  1: /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
                  2: /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
                  3: /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
                  4: /cgn2_6/ptodata/1/pubpaa/US06_PUBCOMB.pep:*
                  5: /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
                  6: /cgn2_6/ptodata/1/pubpaa/PCTUS_PUBCOMB.pep:*
                  7: /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
                  8: /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
                  9: /cgn2_6/ptodata/1/pubpaa/US09A_PUBCOMB.pep:*
                 10: /cgn2_6/ptodata/1/pubpaa/US09B_PUBCOMB.pep:*
                 11: /cgn2_6/ptodata/1/pubpaa/US09C_PUBCOMB.pep:*
                 12: /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:*
                 13: /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep:*
                 14: /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep:*
                 15: /cgn2_6/ptodata/1/pubpaa/US10C_PUBCOMB.pep:*
                 16: /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:*
                 17: /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:*
                 18: /cgn2_6/ptodata/1/pubpaa/US60_PUBCOMB.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
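
The run header names an "sw model" with a BLOSUM62 scoring table and gap penalties Gapop 10.0 and Gapext 0.5. As a rough illustration of what such a search scores, the following is a minimal sketch of Smith-Waterman local alignment with affine gaps (the Gotoh recurrences). It is not GenCore's implementation. Two things are assumptions made for brevity: a toy match/mismatch function `toy_score` stands in for BLOSUM62, and a gap of length k is charged `gap_open + (k - 1) * gap_extend`, which may differ from the convention the real program uses.

```python
def toy_score(x, y):
    """Hypothetical stand-in for a BLOSUM62 lookup: +5 match, -4 mismatch."""
    return 5.0 if x == y else -4.0


def smith_waterman_affine(a, b, score=toy_score, gap_open=10.0, gap_extend=0.5):
    """Return the best local alignment score of sequences a and b."""
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # best[i][j]: best local score of an alignment ending at a[i-1], b[j-1]
    best = [[0.0] * cols for _ in range(rows)]
    # gap_a[i][j]: best score ending with a gap in a (a residue of b unpaired)
    gap_a = [[neg_inf] * cols for _ in range(rows)]
    # gap_b[i][j]: best score ending with a gap in b (a residue of a unpaired)
    gap_b = [[neg_inf] * cols for _ in range(rows)]
    top = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            gap_a[i][j] = max(best[i][j - 1] - gap_open,
                              gap_a[i][j - 1] - gap_extend)
            gap_b[i][j] = max(best[i - 1][j] - gap_open,
                              gap_b[i - 1][j] - gap_extend)
            diagonal = best[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # The floor at 0 is what makes the alignment local.
            best[i][j] = max(0.0, diagonal, gap_a[i][j], gap_b[i][j])
            top = max(top, best[i][j])
    return top


# Ten matches (50) minus one two-residue gap (10 + 0.5) gives 39.5.
gapped = smith_waterman_affine("AAAAACCCCC", "AAAAAGGCCCCC")
```

With the real BLOSUM62 table substituted for `toy_score`, scoring the query against itself would correspond to the "Perfect score" line in the header, though that correspondence is an interpretation rather than something the report states.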



SUMMARIES 



Result          Query
   No.   Score  Match  Length  DB  ID

     1    1749  100.0     330   9  US-09-864-675-2
     2    1749  100.0       ?  16  US-10-447-839A-3
     3    1720   98.3       ?  16  US-10-447-839A-2
     4    1610   92.1       ?  16  US-10-408-765A-610
     5    1505   86.0       ?   9  US-09-864-675-4
     6     960   54.9       ?  13  US-10-096-241-4
     7     881   50.4       ?  13  US-10-096-241-8
     8     881   50.4       ?  13  US-10-096-241-32
     9     842   48.1       ?  13  US-10-096-241-6
    10     821   46.9       ?  13  US-10-096-241-2
    11     748   42.8       ?  13  US-10-096-241-33
    12     523   29.9       ?   8  US-08-736-019-170
    13     521   29.8       ?   9  US-09-795-668-3
    14     521   29.8     410   9  US-09-795-686-3
    15     521   29.8       ?   9  US-09-946-807-3
    16     354   20.2       ?   9  US-09-795-668-4
    17     354   20.2       ?   9  US-09-795-686-4
    18     354   20.2       ?   9  US-09-946-807-4
    19     305   17.4       ?   9  US-09-795-668-5
    20     305   17.4       ?   9  US-09-795-686-5
    21     305   17.4       ?   9  US-09-946-807-5
    22     304   17.4       ?  14  US-10-290-578-10
    23     304   17.4       ?   9  US-09-773-517-11
    24     304   17.4       ?   9  US-09-792-025-11
    25     304   17.4       ?   9  US-09-849-868-11
    26     304   17.4       ?   9  US-09-808-602-85
    27     304   17.4       ?  14  US-10-290-578-2
    28     304   17.4       ?  14  US-10-453-183-11
    29     296   16.9       ?   9  US-09-795-668-2
    30     296   16.9       ?   9  US-09-795-686-2
    31     296   16.9       ?   9  US-09-946-807-2
    32   293.5   16.8       ?   9  US-09-795-668-17
    33   293.5   16.8       ?   9  US-09-795-686-17
    34   293.5   16.8       ?   9  US-09-946-807-17
    35   293.5   16.8       ?  16  US-10-408-765A-883
    36   293.5   16.8       ?   9  US-09-795-668-16
    37   293.5   16.8       ?   9  US-09-795-686-16
    38   293.5   16.8       ?   9  US-09-946-807-16
    39   293.5   16.8     669   9  US-09-773-517-1
    40   293.5   16.8     669   9  US-09-792-025-1
    41   293.5   16.8     669   9  US-09-849-868-1
    42   293.5   16.8     669  14  US-10-022-609-11
    43   293.5   16.8     669  14  US-10-453-183-1
    44     286   16.4     422  13  US-10-096-241-9
    45   285.5   16.3     239   9  US-09-795-668-18

Description

Sequence 2, Appli 
Sequence 3, Appli 
Sequence 2, Appli 
Sequence 610, App 
Sequence 4, Appli 
Sequence 4, Appli 
Sequence 8, Appli 
Sequence 32, Appl 
Sequence 6, Appli 
Sequence 2, Appli 
Sequence 33, Appl 
Sequence 170, App 
Sequence 3, Appli 
Sequence 3, Appli 
Sequence 3, Appli 
Sequence 4, Appli 
Sequence 4, Appli 
Sequence 4, Appli 
Sequence 5, Appli 
Sequence 5, Appli 
Sequence 5, Appli 

Sequence 10, Appl 
Sequence 11, Appl 
Sequence 11, Appl 
Sequence 11, Appl 
Sequence 85, Appl 
Sequence 2, Appli 
Sequence 11, Appl 
Sequence 2, Appli 
Sequence 2, Appli 
Sequence 2, Appli 
Sequence 17, Appl 
Sequence 17, Appl 
Sequence 17, Appl 

Sequence 883, App 
Sequence 16, Appl 
Sequence 16, Appl 
Sequence 16, Appl 
Sequence 1, Appli 
Sequence 1, Appli 
Sequence 1, Appli 
Sequence 11, Appl 
Sequence 1, Appli 
Sequence 9, Appli 
Sequence 18, Appl 
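
The Query Match column appears to be each score expressed as a percentage of the perfect-match score (1749, the score of Results 1 and 2). The following Python sketch checks that reading against a few rows of the summaries; the function name `query_match` is hypothetical, and the interpretation of the column is an assumption inferred from the listed values, not something the report states.

```python
def query_match(score: float, perfect_score: float = 1749.0) -> float:
    """Return a score as a percentage of the perfect-match score, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


# (score, listed Query Match) pairs taken from the summaries for Results 3, 6, 12 and 45.
listed = {3: (1720, 98.3), 6: (960, 54.9), 12: (523, 29.9), 45: (285.5, 16.3)}

for result_no, (score, match) in listed.items():
    # Each computed value agrees with the percentage printed in the summaries.
    print(result_no, query_match(score), match)
```

Because tied scores always give the same percentage, the same relationship can be used to cross-check Query Match values that are hard to read in a scanned copy.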



ALIGNMENTS 



RESULT 1 
US-09-864-675-2 

; Sequence 2, Application US/09864675 

; Patent No. US20020081286A1

; GENERAL INFORMATION: 

; APPLICANT: Marchionni, Mark 



; TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES,

; TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS 

; FILE REFERENCE: 04585/049002 

; CURRENT APPLICATION NUMBER: US/09/864,675

; CURRENT FILING DATE: 2001-05-23 

; PRIOR APPLICATION NUMBER: US 60/206,495 

; PRIOR FILING DATE: 2000-05-23 

; NUMBER OF SEQ ID NOS : 18 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 2 

LENGTH: 330 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-09-864-675-2 



  Query Match             100.0%;  Score 1749;  DB 9;  Length 330;
  Best Local Similarity   100.0%;  Pred. No. 6.5e-135;
  Matches  330;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300

Qy        301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330
              ||||||||||||||||||||||||||||||
Db        301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330





RESULT 2 

US-10-447-839A-3 

; Sequence 3, Application US/10447839A 

; Publication No. US20040018181A1 

; GENERAL INFORMATION: 

; APPLICANT: Kufe, Donald W. 

; APPLICANT: Kharbanda, Surender 

; APPLICANT: Weitman, Steven D. 

; TITLE OF INVENTION: MUC1 INTERFERENCE RNA COMPOSITIONS AND METHODS DERIVED 
THEREFROM 

; FILE REFERENCE: 1000.05.009 

; CURRENT APPLICATION NUMBER: US/10/447,839A
; CURRENT FILING DATE: 2003-05-29 



; PRIOR APPLICATION NUMBER: 10/293,391 

; PRIOR FILING DATE: 2002-11-13 

; PRIOR APPLICATION NUMBER: 09/951,938 

PRIOR FILING DATE: 2001-09-11 
; PRIOR APPLICATION NUMBER: 60/231,841 
; PRIOR FILING DATE: 2000-09-11 
; NUMBER OF SEQ ID NOS : 109 
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 3 

LENGTH: 422 

TYPE: PRT 
; ORGANISM: Homo Sapien 
US-10-447-839A-3 



  Query Match             100.0%;  Score 1749;  DB 15;  Length 422;
  Best Local Similarity   100.0%;  Pred. No. 8.9e-135;
  Matches  330;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 392

Qy        301 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 330
              ||||||||||||||||||||||||||||||
Db        393 DPKQSVLWDTPGTGVSSSQWSTSPSTLDLN 422





RESULT 3 

US-10-447-839A-2 

; Sequence 2, Application US/10447839A 

; Publication No. US20040018181A1 

; GENERAL INFORMATION: 

; APPLICANT: Kufe, Donald W. 

; APPLICANT: Kharbanda, Surender 

; APPLICANT: Weitman, Steven D. 

; TITLE OF INVENTION: MUC1 INTERFERENCE RNA COMPOSITIONS AND METHODS DERIVED 
THEREFROM 

; FILE REFERENCE: 1000.05.009 

; CURRENT APPLICATION NUMBER: US/10/447,839A

; CURRENT FILING DATE: 2003-05-29 

; PRIOR APPLICATION NUMBER: 10/293,391 



; PRIOR FILING DATE: 2002-11-13
; PRIOR APPLICATION NUMBER: 09/951,938
; PRIOR FILING DATE: 2001-09-11
; PRIOR APPLICATION NUMBER: 60/231,841
; PRIOR FILING DATE: 2000-09-11
; NUMBER OF SEQ ID NOS: 109
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 2
; LENGTH: 426
; TYPE: PRT
; ORGANISM: Homo Sapien
US-10-447-839A-2 

  Query Match             98.3%;  Score 1720;  DB 15;  Length 426;
  Best Local Similarity   100.0%;  Pred. No. 2.1e-132;
  Matches  324;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 392

Qy        301 DPKQSVLWDTPGTGVSSSQWSTSP 324
              ||||||||||||||||||||||||
Db        393 DPKQSVLWDTPGTGVSSSQWSTSP 416



RESULT 4 

US-10-408-765A-610 

Sequence 610, Application US/10408765A 
Publication No. US20040101874A1 
GENERAL INFORMATION: 
APPLICANT: Ghosh, Soumitra S. 
APPLICANT: Fahy, Eoin D. 
APPLICANT: Zhang, Bing 
APPLICANT: Gibson, Bradford W. 
APPLICANT: Taylor, Steven W. 
APPLICANT: Glenn, Gary M. 
APPLICANT: Warnock, Dale E. 

TITLE OF INVENTION: TARGETS FOR THERAPEUTIC INTERVENTION 
TITLE OF INVENTION: IDENTIFIED IN THE MITOCHONDRIAL PROTEOME 
FILE REFERENCE: 660088.465 



; CURRENT APPLICATION NUMBER: US/10/408,765A
; CURRENT FILING DATE: 2003-04-04 
; NUMBER OF SEQ ID NOS : 3077 

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 610 

LENGTH: 850 
; TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-408-765A-610 



  Matches  304;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 392

Qy        301 DPKQ 304
              ||||
Db        393 DPKQ 396

RESULT 5 
US-09-864-675-4 

; Sequence 4, Application US/09864675
; Patent No. US20020081286A1
; GENERAL INFORMATION:
; APPLICANT: Marchionni, Mark
; TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES,
; TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS
; FILE REFERENCE: 04585/049002
; CURRENT APPLICATION NUMBER: US/09/864,675
; CURRENT FILING DATE: 2001-05-23
; PRIOR APPLICATION NUMBER: US 60/206,495
; PRIOR FILING DATE: 2000-05-23
; NUMBER OF SEQ ID NOS: 18
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 4
; LENGTH: 298
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-864-675-4 



  Query Match             86.0%;  Score 1505;  DB 9;  Length 298;
  Best Local Similarity   98.6%;  Pred. No. 5.5e-115;
  Matches  285;  Conservative    1;  Mismatches    3;  Indels    0;  Gaps    0;

Qy          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60

Qy         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120

Qy        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180

Qy        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240

Qy        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 289
              ||||||||||||||||||||||||||||||||||||||||| |: | ||
Db        241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 289
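
In the alignment blocks, the line between each Qy and Db row marks identical residues with `|`. The Python sketch below rebuilds that marker line for an ungapped pair of rows and counts the identities. The function name `identity_line` is hypothetical, and the sketch deliberately marks identities only: the report also flags conservative substitutions with `:`, which would need a substitution matrix and is left out here.

```python
def identity_line(qy: str, db: str) -> tuple[str, int]:
    """Return the '|' marker line and identity count for two equal-length rows."""
    if len(qy) != len(db):
        raise ValueError("rows must have the same length (ungapped alignment)")
    # '|' where the residues are identical, a blank otherwise.
    line = "".join("|" if a == b else " " for a, b in zip(qy, db))
    return line, line.count("|")


# Final block of Result 5 (query 241-289 against database sequence 241-289).
qy = "TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC"
db = "TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC"

line, identities = identity_line(qy, db)
print(qy)
print(line)
print(db)
print(identities, "identities in", len(qy), "columns")
```

The four non-identical columns in this block correspond to the one conservative substitution and three mismatches reported in the Result 5 statistics.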





RESULT 6 
US-10-096-241-4 

; Sequence 4, Application US/10096241 
; Publication No. US20020127594A1 
; GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
; Busfield, Samantha J. 

; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS : 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
; CITY: Boston 

STATE: MA 
COUNTRY: US 
; ZIP: 02110-2804 

COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

; OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/10/096,241 

FILING DATE: 12-Mar-2002 
CLASSIFICATION: <Unknown> 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 



ATTORNEY/ AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 
REFERENCE/ DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE : 617-542-5070 
TELEFAX : 617-542-8906 
TELEX : <Unknown> 
; INFORMATION FOR SEQ ID NO: 4: 
; SEQUENCE CHARACTERISTICS: 

LENGTH : 181 amino acids 
TYPE: amino acid 
STRANDEDNESS : not relevant 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
SEQUENCE DESCRIPTION: SEQ ID NO: 4:
US-10-096-241-4 



  Query Match             54.9%;  Score 960;  DB 13;  Length 181;
  Best Local Similarity   97.8%;  Pred. No. 1.2e-70;
  Matches  177;  Conservative    3;  Mismatches    1;  Indels    0;  Gaps    0;

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db          1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db         61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 329
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLDL 180

Qy        330 N 330
              |
Db        181 N 181



RESULT 7 
US-10-096-241-8 

; Sequence 8, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J.
TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET : 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 



COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE : FastSEQ Version 2.0 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241
FILING DATE: 12-Mar-2002 
; CLASSIFICATION: <Unknown> 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/ AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: <Unknown> 
INFORMATION FOR SEQ ID NO: 8: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 469 amino acids
; TYPE: amino acid 

; STRANDEDNESS: not relevant 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
US-10-096-241-8 



  Query Match             50.4%;  Score 881;  DB 13;  Length 469;
  Best Local Similarity   100.0%;  Pred. No. 1.2e-63;
  Matches  163;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
              |||||||||||||||||||||||||||||||||||||||||||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 193



RESULT 8 

US-10-096-241-32 

; Sequence 32, Application US/10096241 
; Publication No. US20020127594A1 
GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
; Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 



NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE : Fish & Richardson P.C. 
STREET : 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
; ZIP: 02110-2804 

; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE : FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/10/096,241

FILING DATE: 12-Mar-2002 
CLASSIFICATION: <Unknown> 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07334/022001 

TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX : 617-542-8906 
TELEX: <Unknown> 
INFORMATION FOR SEQ ID NO: 32: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 647 amino acids
; TYPE: amino acid 

STRANDEDNESS: single 
; TOPOLOGY: linear 

MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
; SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

US-10-096-241-32 



  Query Match             50.4%;  Score 881;  DB 13;  Length 647;
  Best Local Similarity   100.0%;  Pred. No. 1.9e-63;
  Matches  163;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy        142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 201
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         31 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKN 90

Qy        202 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 261
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         91 SRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYC 150

Qy        262 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
              |||||||||||||||||||||||||||||||||||||||||||
Db        151 VNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 193



RESULT 9 



US-10-096-241-6 

; Sequence 6, Application US/10096241 
; Publication No. US20020127594A1 
GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 

Busfield, Samantha J. 
TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
; NUMBER OF SEQUENCES: 33 

CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
; STREET: 225 Franklin Street 

CITY: Boston 
STATE: MA 
; COUNTRY: US 

; ZIP: 02110-2804 

; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
; OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241
FILING DATE: 12-Mar-2002 
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
; FILING DATE: 19-AUG-1996 

ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

; REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07334/022001 
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
; TELEX: <Unknown> 

INFORMATION FOR SEQ ID NO: 6: 
; SEQUENCE CHARACTERISTICS: 

LENGTH: 407 amino acids
; TYPE: amino acid 

STRANDEDNESS : not relevant 
; TOPOLOGY: linear 

MOLECULE TYPE: protein 
; FRAGMENT TYPE: internal 

SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
US-10-096-241-6 

  Query Match             48.1%;  Score 842;  DB 13;  Length 407;
  Best Local Similarity   98.7%;  Pred. No. 1.6e-60;
  Matches  156;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVL 307
              |||||||||||||||||||||||||||||||||||  |
Db        121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQKHL 158



RESULT 10 
US-10-096-241-2 

; Sequence 2, Application US/10096241 
; Publication No. US20020127594A1 
; GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 

Busfield, Samantha J. 
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

STREET : 225 Franklin Street 
CITY: Boston 
STATE: MA 
; COUNTRY: US 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 
; MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241 
FILING DATE: 12-Mar-2002
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 
REFERENCE/ DOCKET NUMBER: 07334/022001 
; TELECOMMUNICATION INFORMATION: 

; TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906 
; TELEX: <Unknown> 

INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 605 amino acids 
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
; FRAGMENT TYPE: internal 

SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
US-10-096-241-2 

  Query Match             46.9%;  Score 821;  DB 13;  Length 605;
  Best Local Similarity   97.4%;  Pred. No. 1.4e-58;
  Matches  151;  Conservative    3;  Mismatches    1;  Indels    0;  Gaps    0;

Qy        150 MKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKV 209
              ||||||:||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db          1 MKSQTGEVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNVRKNSRLQFNKV 60

Qy        210 KVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 269
              :||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
Db         61 RVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYY 120

Qy        270 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 304
              |||||||||||||||||||||||||||||||||||
Db        121 IEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQ 155



RESULT 11 
US-10-096-241-33 

; Sequence 33, Application US/10096241 
; Publication No. US20020127594A1
GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
; Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
; STREET: 225 Franklin Street 

CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 
; MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE : FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/10/096,241

; FILING DATE: 12-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001

TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: <Unknown> 
INFORMATION FOR SEQ ID NO: 33: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 139 amino acids
TYPE: amino acid 
STRANDEDNESS: not relevant 



TOPOLOGY: linear 
; MOLECULE TYPE: protein 

SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
US-10-096-241-33 



  Query Match             42.8%;  Score 748;  DB 13;  Length 139;
  Best Local Similarity   98.6%;  Pred. No. 2e-53;
  Matches  137;  Conservative    2;  Mismatches    0;  Indels    0;  Gaps    0;

Qy        192 RIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSGHAR 251
              ||||||||||||||||||:||||||||||||||||||||||||:||||||||||||||||
Db          1 RIKYGNGRKNSRLQFNKVRVEDAGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHAR 60

Qy        252 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTP 311
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTP 120

Qy        312 GTGVSSSQWSTSPSTLDLN 330
              |||||||||||||||||||
Db        121 GTGVSSSQWSTSPSTLDLN 139
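
The Best Local Similarity figures in the statistics lines are consistent with the number of matches divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels). The Python sketch below illustrates that relationship; the function name `best_local_similarity` is hypothetical, and the formula is an assumption inferred from the reported counts rather than a definition given in the report.

```python
def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned columns, to one decimal."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# (matches, conservative, mismatches, indels) from the statistics lines
# of Results 5, 6, 9 and 11.
reported = {5: (285, 1, 3, 0), 6: (177, 3, 1, 0),
            9: (156, 0, 2, 0), 11: (137, 2, 0, 0)}

for result_no, counts in reported.items():
    print(result_no, best_local_similarity(*counts))
```

For these four results the computed values are 98.6, 97.8, 98.7 and 98.6, which agree with the percentages printed in the corresponding statistics lines.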



RESULT 12 
US-08-736-019-170 

Sequence 170, Application US/08736019 
Publication No. US20030207799A1 
GENERAL INFORMATION: 

APPLICANT: Goodearl, Andrew 
APPLICANT: Stroobant, Paul 
APPLICANT: Minghetti, Luisa 
APPLICANT: Waterfield, Michael 
APPLICANT: Marchionni, Mark 
APPLICANT: Chen, Mario 
APPLICANT: Hiles, Ian 

TITLE OF INVENTION: GLIAL MITOGENIC FACTORS, THEIR 
TITLE OF INVENTION: PREPARATION AND USE 
NUMBER OF SEQUENCES: 189
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Clark & Elbing LLP 
STREET: 176 Federal Street 
CITY: Boston 
STATE: Massachusetts 
COUNTRY: U.S.A. 
ZIP: 02110 
COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5" Diskette, 1.44 Mb 
COMPUTER: IBM Compatible Pentium 
OPERATING SYSTEM: Windows95 
SOFTWARE: FastSeq Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/736,019
FILING DATE: 22-OCT-1996 
CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/471,833 
FILING DATE: 06-JUN-1995 
PRIOR APPLICATION DATA: 



APPLICATION NUMBER: 08/036,555 
; FILING DATE: 24-MAR-1993 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/965,173 
FILING DATE: 23-OCT-1992 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 07/907,138 

FILING DATE: 30-JUN-1992 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 07/940,389 

; FILING DATE: 03-SEP-1992 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/863,703 
; FILING DATE: 03-APR-1992 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: UK 91 07566.3 
; FILING DATE: 10-APR-1991 

; ATTORNEY/ AGENT INFORMATION : 

NAME: Bieker-Brady, Kristina 
REGISTRATION NUMBER: 39,109 
; REFERENCE/DOCKET NUMBER: 04585/00200Q 

TELECOMMUNICATION INFORMATION: 
TELEPHONE: (617) 428-0200 
; TELEFAX: (617) 428-7045 

TELEX : 

; INFORMATION FOR SEQ ID NO: 170: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 422 
; TYPE: amino acid 

STRANDEDNESS: 
; TOPOLOGY: linear 

US-08-736-019-170 

Query Match 29.9%; Score 523; DB 8; Length 422; 

Best Local Similarity 35.5%; Pred. No. 2.3e-34; 

Matches 124; Conservative 59; Mismatches 88; Indels 78; Gaps 13; 

15 GVSLACYS — PSLKSVQDQAYKAPVWEGKV QGLV PAGGSSS— NSTRE 59 



Db 



' ' ' m . i i i . i : i i i : | | | | | | : II: | | 

58 GASV-CYSSPPSVGSVQELAQRAAWIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDRE 116 




LVKVLDKWPLRSGGLQ 83 



Db 



117 PPAAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLV^ 176 



Qy 



84 REQVISV . GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKN 128 



Db 



1 1 ' ' ■ • i i i i f * I I : | I : : I I I : | | : | 

177 KDSLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRN 235 



Qy 



129 LKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS 188 

I N II : : I I || I : I | : | | M | | : | | : : : : II I : I I I I I 

236 LKKEVSRVLCKRCALPPQLKEMKSQESAAGSKLVXRCETSSEYSSLRFKWFKNGNELNRK 2 95 

189 ---RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245 

: : I : I : I : M:ll : I : I I I : I : : I I I : : : : | : | 
2 96 NKPQNIKIQKKPGK— SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTS 353 



Db 



Qy 



Db 



Qy 246 WSG — HARKCNETAKS YCVNGGVCYYI EGINQLS CKCPNGFFGQRC 289 

: i I ill I : : | | | | | | : : : : : | I I II I I | I | 
Db 354 TTGTSHLVKCAEKEKT FCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRC 4 02 

RESULT 13 
US-09-795-668-3 

Sequence 3, Application US/09795668 
Patent No. US20020045577A1 
GENERAL INFORMATION: 
APPLICANT: Stefansson, Hreinn 
APPLICANT: Steinthorsdottir, Valgerdur
APPLICANT: Gulcher, Jeffrey R. 
TITLE OF INVENTION: HUMAN SCHIZOPHRENIA GENE 
FILE REFERENCE: 2345.2004-001
CURRENT APPLICATION NUMBER: US/09/795,668
CURRENT FILING DATE: 2001-02-28 
PRIOR APPLICATION NUMBER: US 09/515,716 
PRIOR FILING DATE: 2000-02-28 
NUMBER OF SEQ ID NOS : 1531 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 3 
LENGTH: 418 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-795-668-3 

Query Match 29.8%; Score 521; DB 9; Length 418; 

Best Local Similarity 35.2%; Pred. No. 3.3e-34; 

Matches 122; Conservative 59; Mismatches 90; Indels 76; Gaps 12; 

GVS LACYS PS LKS VQDQAYKAPVWEGKV QGLV PAGGSSS— NSTREPP 61 

I I : I I I : I I I : I : I I I : I I I I I I : | | : | | | | 

GASV-CSPPSVGSVQEI^QRAAWIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDREPP 114 

ASGRVA LVKVLDKWPLRSGGLQRE 85 

Mil I I I I | ::: | | | ::: 

T^GPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVKAGGLKKD 174 

QVI SV GS CVPLERNQRYI FFLEP TEQPLVFKTAFAPLDTNGKNLK 130 

—I M I: : 11111:11 : I |: :| ||:| |:||| 



Ml :MI I I 1:11:1111 II |:|| :: ::|||:| MM 

KEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNR 

-RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSi 
: : I : I : M I I : M : I : I II : I : : I I | : : : : | : 
PQNIKIQKKPGK--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATST! 

G-- HARKCNETAKS YCVNGGVCYYI EGINQLS CKCPNGFFGQRC 289 

I I M I l::MIM I: :: :: I Mill I I II 



Qy 


15 


Db 


56 


Qy 


62 


Db 


115 


Qy 


86 


Db 


175 


Qy 


131 


Db 


234 


QY 


189 


Db 


294 


QY 


248 


Db 


352 



RESULT 14 



US-09-795-686-3 

; Sequence 3, Application US/09795686 

; Patent No. US20020094954A1

; GENERAL INFORMATION: 

; APPLICANT: Stefansson, Hreinn 

; APPLICANT: Steinthorsdottir, Valgerdur

; APPLICANT: Gulcher, Jeffrey R. 

; TITLE OF INVENTION: HUMAN SCHIZOPHRENIA GENE 

; FILE REFERENCE: 2345.2005-001

; CURRENT APPLICATION NUMBER: US/09/795,686 

; CURRENT FILING DATE: 2001-02-28

; PRIOR APPLICATION NUMBER: US 09/515,715 

; PRIOR FILING DATE: 2000-02-28 

; NUMBER OF SEQ ID NOS : 1531 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 3 

LENGTH: 418 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-09-795-686-3 

Query Match 29.8%; Score 521; DB 9; Length 418; 

Best Local Similarity 35.2%; Pred. No. 3.3e-34; 

Matches 122; Conservative 59; Mismatches 90; Indels 76; Gaps 12; 

QV 15 GVS LAC YS P S LKS VQDQAYKAPVWEGKV QGLV PAGGSSS— NSTREPP 61 

I I : I I I : I I I : I : I I I : I I I I I I : I I : | | | | 

Db 56 GASV-CSPPSVGSVQELAQRAAWIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDREPP 114 

Qy 62 ASGRVA LVKVLDKWPLRSGGLQRE 85 

I : i I | | | | | : : : | | | : : : 

Db 115 AAGPRALGPP7VEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVIiQWAVKAGGLKKD 174 

QY 8 6 QVISV GSCVPLERNQRYI FFLEP TEQPLVFKTAFAPLDTNGKNLK 130 

= • : I M I : : I II I I : I I : I I : : I I I : I I : I I I 

Db 175 SLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRNLK 233 

Qy 131 KEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS-- 188

Ml II I:||:|||| I I |:|| :: ::|||:| Mil 

Db 234 KEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNK 293 

Qy 189 -RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWS 247

: : I : I : I : I I : I I : | : | | | : | : : | | | : : : : | : | : 

Db 294 PQNIKIQKKPGK — SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTSTT 351 

Qy 248 G--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPNGFFGQRC 289



Db 352 GTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRC 398 
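
The "Best Local Similarity" figure printed with each result appears to equal Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). That relationship is inferred from the numbers in this report, not taken from GenCore documentation, so the following Python sketch should be read as an assumption-laden check rather than the program's actual method. The function names are hypothetical.

```python
import re

# Matches one "Name count" pair from a statistics line of the report.
STAT_FIELD = re.compile(r"(Matches|Conservative|Mismatches|Indels|Gaps)\s+(\d+)")

def parse_alignment_counts(line):
    """Extract the integer counts from a 'Matches ...; Gaps ...;' line."""
    return {name: int(value) for name, value in STAT_FIELD.findall(line)}

def best_local_similarity(counts):
    """Assumed formula: identical columns as a percentage of all alignment columns."""
    columns = (counts["Matches"] + counts["Conservative"]
               + counts["Mismatches"] + counts["Indels"])
    return round(100.0 * counts["Matches"] / columns, 1)

line = "Matches 122; Conservative 59; Mismatches 90; Indels 76; Gaps 12;"
counts = parse_alignment_counts(line)
print(counts, best_local_similarity(counts))
```

Applied to the statistics line of the result above, the assumed formula reproduces the reported 35.2%; the same check holds for the other statistics lines in this listing that were tried.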



RESULT 15 
US-09-946-807-3 

; Sequence 3, Application US/09946807 

; Patent No. US20020165144A1 

; GENERAL INFORMATION: 

; APPLICANT: Stefansson, Hreinn 

; APPLICANT: Steinthorsdottir, Valgerdur 



; APPLICANT: Gulcher, Jeffrey R. 

; TITLE OF INVENTION: HUMAN SCHIZOPHRENIA GENE 

; FILE REFERENCE: 2345.2004-001 

; CURRENT APPLICATION NUMBER: US/09/946,807 

; CURRENT FILING DATE: 2001-09-05

; PRIOR APPLICATION NUMBER: US/09/795,668

; PRIOR FILING DATE: 2001-02-28 

; PRIOR APPLICATION NUMBER: US 09/515,716 

; PRIOR FILING DATE: 2000-02-28 

; NUMBER OF SEQ ID NOS: 1531

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 3

; LENGTH: 418

; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-946-807-3 



Query Match 29.8%; Score 521; DB 9; Length 418; 

Best Local Similarity 35.2%; Pred. No. 3.3e-34; 

Matches 122; Conservative 59; Mismatches 90; Indels 76; Gaps 12; 

Qy 15 GVS LAC YS P S LKS VQDQAYKAPVWEGKV QGLV PAGGSSS— NSTREPP 61

I I : I I I : M I : I : I I I : | | | | | | : | | . , , , , 

Db 56 GASV-CSPPSVGSVQELAQRAAVVIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDREPP 114 

Qy 62 ASGRVA LVKVLDKWPLRSGGLQRE 85

' : ' ' I I I I I : : : I | I * • • 

Db 115 AAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVK^ 174 

Qy 86 QVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKNLK 130 

: : : 1 11 I : : M I I I : I I : I I : : I I I : I I : I I I 

Db 175 SLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRASFPPLET-GRNLK 233 

Qy 131 KEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS-- 188

0 I I I : : I I II |:||:|||| I I | : | | : : : : I I I : I | | | | 

Db 234 KEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNK 293 

Qy 189 -RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWS 247

::|:|: I s I I: II : |:|||:|: : || |: : : : | :| : 

Db 294 PQNIKIQKKPGK SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNATSTSTT 351 

Qy 248 G--HARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPNGFFGQRC 289

I::|MII I: :: :: | ||||| | | || 

Db 352 GTSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRC 398



Search completed: August 17, 2004, 14:22:29 
Job time : 45.6656 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         August 17, 2004, 14:05:35 ; Search time 37.3089 Seconds
                (without alignments)
                2790.781 Million cell updates/sec

Title:          US-09-864-675-2
Perfect score:  1749
Sequence:       1 MRRDPAPGFSMLLFGVSLAC..........PGTGVSSSQWSTSPSTLDLN 330



Scoring table: BLOSUM62 

Gapop 10.0, Gapext 0.5
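
The parameters above (the "sw" model with a gap-open and a gap-extension penalty) describe Smith-Waterman local alignment with affine gaps. The Python sketch below shows the standard Gotoh recurrences for that technique. It is illustrative only: `toy_score` is a hypothetical match/mismatch function standing in for the BLOSUM62 table, and the convention that a gap of length k costs `gap_open + k * gap_ext` is an assumption, since this report does not state how GenCore combines the two penalties.

```python
def smith_waterman_affine(a, b, score, gap_open=10.0, gap_ext=0.5):
    """Best local alignment score of a and b with affine gap penalties.

    Assumed convention: a gap of length k costs gap_open + k * gap_ext.
    """
    neg_inf = float("-inf")
    m = len(b)
    h_prev = [0.0] * (m + 1)      # H: best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * (m + 1)  # F: best score ending in a gap that consumes a[i-1]
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # E: best score ending in a gap that consumes b[j-1]
        for j in range(1, m + 1):
            e = max(h_cur[j - 1] - gap_open - gap_ext, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open - gap_ext, f_prev[j] - gap_ext)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            h_cur[j] = max(0.0, diag, e, f_cur[j])  # 0.0 restarts the local alignment
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best

def toy_score(x, y):
    """Hypothetical substitution score; a real search would look up BLOSUM62."""
    return 5.0 if x == y else -4.0

print(smith_waterman_affine("A" * 10 + "T" * 10, "A" * 10 + "CCC" + "T" * 10, toy_score))
```

With the toy scoring, the example bridges the three inserted residues with a single gap because one gap of length 3 is cheaper than any combination of mismatches and shorter gaps.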



Searched:       1017041 seqs, 315518202 residues

Total number of hits satisfying chosen parameters:      1017041

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
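
The reported throughput can be cross-checked from figures in this header, assuming (as is usual for Smith-Waterman implementations, though not stated in the report) that one "cell update" is one query residue compared against one database residue. A minimal Python sketch with a hypothetical function name:

```python
def cell_updates_per_second(query_length, db_residues, seconds):
    """Dynamic-programming cells filled per second for one query against the database."""
    return query_length * db_residues / seconds

# Query length 330, 315518202 database residues, search time 37.3089 seconds.
rate = cell_updates_per_second(330, 315_518_202, 37.3089)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

The result agrees with the reported 2790.781 Million cell updates/sec to within rounding of the printed search time.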



Database :      SPTREMBL_25:*
                1:  sp_archea:*
                2:  sp_bacteria:*
                3:  sp_fungi:*
                4:  sp_human:*
                5:  sp_invertebrate:*
                6:  sp_mammal:*
                7:  sp_mhc:*
                8:  sp_organelle:*
                9:  sp_phage:*
                10: sp_plant:*
                11: sp_rodent:*
                12: sp_virus:*
                13: sp_vertebrate:*
                14: sp_unclassified:*
                15: sp_rvirus:*
                16: sp_bacteriap:*
                17: sp_archeap:*


Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.



SUMMARIES

Result                 Query
   No.     Score  Match  Length  DB  ID        Description

     1       485   27.7     782  11  Q9ESA5    Q9esa5 rattus norv
     2       460   26.3     342  11  Q9ESA1    Q9esa1 rattus norv
     3       452   25.8     323  11  Q9ESA2    Q9esa2 rattus norv
     4       450   25.7     317  11  Q9ESA3    Q9esa3 rattus norv
     5       444   25.4      79  11  Q810X2    Q810x2 mus musculu
     6     404.5   23.1     348   4  Q8NFN3    Q8nfn3 homo sapien
     7       290   16.6     461  11  O35947    O35947 mesocricetu
     8       271   15.5     241   6  Q07112    Q07112 bos taurus
     9       234   13.4     211  11  Q8BKI8    Q8bki8 mus musculu
    10       221   12.6     244  11  Q9ESA4    Q9esa4 rattus norv
    11       200   11.4      54  11  Q810X1    Q810x1 mus musculu
    12     180.5   10.3     167   4  Q8NFN2    Q8nfn2 homo sapien
    13       166    9.5    8625   5  Q86GD6    Q86gd6 procambarus
    14     164.5    9.4    5175   5  Q8I0L3    Q8i0l3 caenorhabdi
    15     164.5    9.4    5198   5  O76518    O76518 caenorhabdi
    16       158    9.0    1106   4  Q8WX93    Q8wx93 homo sapien
    17       153    8.7    8943   5  Q9V4F7    Q9v4f7 drosophila
    18       151    8.6     298  11  Q9JI59    Q9ji59 mus musculu
    19       151    8.6     298  11  Q8C5K9    Q8c5k9 mus musculu
    20       151    8.6     410   4  Q8N1M2    Q8n1m2 homo sapien
    21       151    8.6    1323  13  Q08476    Q08476 gallus gall
    22     150.5    8.6    6658   5  O76281    O76281 drosophila
    23       150    8.6     298  11  Q8CE95    Q8ce95 mus musculu
    24       150    8.6    3950   6  Q7YRF5    Q7yrf5 canis famil
    25       149    8.5     512   4  Q96DN8    Q96dn8 homo sapien
    26       149    8.5    5636   4  Q96RW7    Q96rw7 homo sapien
    27     146.5    8.4     330  13  Q90Z42    Q90z42 gallus gall
    28     146.5    8.4    4076  11  Q7TN00    Q7tn00 rattus norv
    29     144.5    8.3     754  11  Q8BZ76    Q8bz76 mus musculu
    30     144.5    8.3     858   5  O18466    O18466 hirudo medi
    31     144.5    8.3    7962   4  Q10465    Q10465 homo sapien
    32     144.5    8.3   34350   4  Q8WZ42    Q8wz42 homo sapien
    33       144    8.2     338   4  Q8IV49    Q8iv49 homo sapien
    34     143.5    8.2     507   4  Q96K90    Q96k90 homo sapien
    35     143.5    8.2    1320   4  Q96KF5    Q96kf5 homo sapien
    36     143.5    8.2    1320   4  Q86TC9    Q86tc9 homo sapien
    37     143.5    8.2    1391   4  Q8N3L4    Q8n3l4 homo sapien
    38     143.5    8.2    1612  11  O89026    O89026 mus musculu
    39     143.5    8.2    1651   4  Q9Y6N7    Q9y6n7 homo sapien
    40     142.5    8.1     947   5  Q26262    Q26262 caenorhabdi
    41     142.5    8.1     947   5  O44171    O44171 caenorhabdi
    42       142    8.1    2693   5  Q8ISF3    Q8isf3 caenorhabdi
    43       142    8.1    2708   5  Q8ISF4    Q8isf4 caenorhabdi
    44       142    8.1    2780   5  Q8MNS0    Q8mns0 caenorhabdi
    45       142    8.1    2808   5  Q8MNS1    Q8mns1 caenorhabdi
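
The "Query Match" column in the summaries appears to be each hit's score expressed as a percentage of the perfect score (1749) given in the run header. This is inferred from the printed values rather than from GenCore documentation, and the function name below is hypothetical. A short Python sketch of the arithmetic:

```python
PERFECT_SCORE = 1749  # "Perfect score" reported in the run header

def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Assumed definition: hit score as a percentage of the query's self-alignment score."""
    return round(100.0 * score / perfect_score, 1)

# (result number, score) pairs taken from the summaries
for result_no, score in [(1, 485), (6, 404.5), (45, 142)]:
    print(result_no, score, query_match_percent(score))
```

For the three rows shown, the computed percentages match the Query Match values listed in the summaries.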



ALIGNMENTS 



RESULT 1 
Q9ESA5 

ID Q9ESA5 PRELIMINARY; PRT; 782 AA. 

AC Q9ESA5 ; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 



DE Glial growth factor beta la (Fragment).

GN NRG1.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=Sprague-Dawley; TISSUE=Spinal cord, and Brain stem;

RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R.J. Jr.,

RA Frohnert P.W.;

RT "Structural and Functional Diversity of Glial Growth Factor Isoforms

RT Expressed in Regenerating Peripheral Nerve and Associated Neurons.";

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF194993; AAG28433.1; -. 

DR HSSP; Q12784; 1HRE. 

DR GO; GO: 0005102; F: receptor binding; IEA. 

DR GO; GO: 0005351; F: sugar porter activity; IEA. 

DR GO; GO: 0009790; P: embryonic development; IEA. 

DR GO; GO: 0009401; P : phosphoenolpyruvate-dependent sugar phospho. . ; IEA 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR002114; HPrjSerPjS. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR007110; Ig-like.

DR InterPro; IPR003598; Ig_c2.

DR InterPro; IPR002154; Neuregulin.

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF02158; Neuregulin; 1. 

DR PRINTS; PR01089; NEUREGULIN.

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

DR PROSITE; PS00589; PTS_HPR_SER; 1. 

KW EGF-like domain; Immunoglobulin domain. 

FT NON_TER 1 1

SQ SEQUENCE 782 AA; 86036 MW; F6174A68F4E27BDE CRC64;

Query Match 27.7%; Score 485; DB 11; Length 782;

Best Local Similarity 32.4%; Pred. No. 5.3e-34;

Matches 118; Conservative 57; Mismatches 87; Indels 102; Gaps 12;






Qy 


23 


II" Ml: 1 : 1 1 1 : 1 1 1 1 1 

PSVGSVQELARRAAWIEGKVHPPRRQQGALDRKAAGEAGAGARDQPVQDSPPSQDPLPA 


45 


Db 


1 


60 


Qy 


46 


— LVPAGGSSSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISV GS 
: 1 1 1 : : : 1 II I 1 1 : : : I I | : : : : : : | | 
VNWTLPTGGPEPST— DQPGDPAPYLVKVHQWAVKAGGLKKDSLLTVRLDTWGHPAFPS 


92 


Db 


61 


118 


Qy 


93 


CVPLERNQRYI FFLEPT EQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKL 

1 1 : : M 1 1 1 : 1 1 | 1 : : 1 | | : | | : | | M II : : : | 

CGRLKEDSRYIFFMEPDANSSGRAPPAFRASFPPLET-GRNLKKEVSRVLCKRCALPPRL 


147 


Db 


119 


177 


Qy 


148 


KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNGRKNSRL 

1=1 Ml II |:|| :: =:|||:| |||| :|:|: | : | | 


204 



UD 


178 


Qy 


205 


UD 


236 


Qy 


237 


Db 


296 


Qy 


286 


Db 


356 



17 8 KEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPGK--SEL 235 

'NKVKVEDAGEYVCEAENILGKDTVRGRL yv 23 6 

I I : I : I I I : I : : I I I : • i i 



-NSVSTTLSSWSG HARKCNETAKS YCVNGGVCYYIEGINQLS CKCPNGFF 285 

1 : I : I : I : i I M I I : : I I I I I I : : : : : | Mill I 



I I I 



RESULT 2 
Q9ESA1 

ID Q9ESA1 PRELIMINARY; PRT; 342 AA 

AC Q9ESA1; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Glial growth factor GGF beta 4 (Fragment) . 

GN NRG1 . 

OS Rattus norvegicus (Rat) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Rattus 

OX NCBI_TaxID=10116; 
RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; 

RC TISSUE=Axotomized lumbar dorsal root ganglion/spinal cord; 

RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R.J. Jr.,

RA Frohnert P.W.;

RT "Structural and Functional Diversity of Glial Growth Factor Isoforms

RT Expressed in Regenerating Peripheral Nerve and Associated Neurons.";

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF194997; AAG28451.1; -. 

DR HSSP; Q12784; 1HRE. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS00022; EGF_1 ; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW EGF-like domain; Immunoglobulin domain. 

FT NON_TER 1 1

FT NON_TER 342 342 

SQ SEQUENCE 342 AA; 37836 MW; 8BE36FC836553124 CRC64; 

Query Match 26.3%; Score 460; DB 11; Length 342;

Best Local Similarity 33.8%; Pred. No. 2.8e-32; 

Matches 106; Conservative 54; Mismatches 92; Indels 62; Gaps 10; 






Q y 43 VQGLVPAGGSSSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISV — GS 92 

I : I 'I : : : I I I I I I : : : I II : : : : : : I I 

Db 5 VNWTLPTGGPEPST--DQPGDPAPYLVXVHQWAVKAGGLKKDSLLTVRLDTWGHPAFPS 62 

Qy 93 CVPLERNQRYIFFLEPT EQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKL 147 

' 1—11111:11 I I: :! M:l hi I II || :: I I I I |:| 

Db 63 CGRLKEDSRYIFFMEPDANSSGRAPPAFRASFPPLET-GRNLKKEVSRVLCKRCALPPRL 121 

QY 148 KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS---RDIRIKYGNGRKNSRL 204 

1 : I N I II : : I I I : I I | I I : I : I : I : | | 

Db 122 KEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPGK — SEL 179 

QY 2 05 QFNKVKVEDAGE YVCEAENI LGKDTVRGRL YV — 236 

: M : : || |: : || 

Db 180 RINKASIADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSESPIRIS 239 

Q ^ 237 NSVSTTLSSWSG — HARKCNETAKSYCVNGGVCYYIEGINQLS CKCPNGFF 285 

I : I : I : I : I I Ml I : : I I I I I I : : : : : | I I I I I | 
Db 240 VSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCKCPNEFT 299 

Qy 286 GQRCLEKLPLRLYM 299 

Ml : II 
Db 300 GDRCQNYVMAS FYM 313 
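
Each database record in this listing uses two-letter line codes (ID, AC, DT, DE, GN, OS, OC, DR, KW, FT, SQ and so on) at the start of the line. As a rough illustration of how such a record could be grouped by code, here is a minimal Python sketch; `group_by_line_code` is a hypothetical helper, not part of any library, and it assumes the codes are intact, which is not true of every extraction-damaged line in this report.

```python
from collections import defaultdict

def group_by_line_code(record_text):
    """Collect the content of each line under its two-letter upper-case line code."""
    fields = defaultdict(list)
    for line in record_text.splitlines():
        code, _, rest = line.strip().partition(" ")
        # Alignment rows ("Qy", "Db") are mixed case and are skipped here.
        if len(code) == 2 and code.isupper() and rest.strip():
            fields[code].append(rest.strip())
    return dict(fields)

record = """ID Q9ESA1 PRELIMINARY; PRT; 342 AA.
AC Q9ESA1;
DE Glial growth factor GGF beta 4 (Fragment).
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00047; ig; 1.
Qy 286 GQRCLEKLPLRLYM 299"""

fields = group_by_line_code(record)
print(fields["AC"], len(fields["DR"]))
```

Keeping repeated codes as lists preserves the order of multi-line fields such as DR and OC.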



RESULT 3 
Q9ESA2 

ID Q9ESA2 PRELIMINARY; PRT; 323 AA. 

AC Q9ESA2 ; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Glial growth factor GGF beta 3 (Fragment) . 

GN NRG1 . 

OS Rattus norvegicus (Rat) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Rattus 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; 

RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R J Jr 

RA Frohnert P.W. ; * ' 

RT "Structural and Functional Diversity of Glial Growth Factor Isoforms 

RT Expressed in Regenerating Peripheral Nerve and Associated Neurons "; 

RL Submitted (OCT-1999) to the EMBL/ GenBank/DDBJ databases 

DR EMBL; AF194996; AAG28450.1; -. 

DR HSSP; Q12784; 1HRE . 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; I EGF. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 



DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW EGF-like domain; Immunoglobulin domain. 

FT NON_TER 1 1 

SQ SEQUENCE 323 AA; 35358 MW; C7DF153A939A80C8 CRC64; 

Query Match 25.8%; Score 452; DB 11; Length 323; 

Best Local Similarity 34.2%; Pred. No. 1.3e-31; 

Matches 104; Conservative 52; Mismatches 86; Indels 62; Gaps 10; 

Qy 43 VQGLVPAGGSSSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISV GS 92 

I : I I I : : : I I I I I I : : : I I I : : : : : : | . | 

Db 5 VNWTLPTGGPEPST DQPGDPAPYLVXVHQVWAVKAGGLKKDSLLTVRLDTWGHPAFPS 62 

Qy 93 CVPLERNQRYIFFLEPT EQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKL 147 

c ' I : : I I I I I = I I I I : : I I I : I | : | | | | | | : : | | | | | : | 

Db 63 CGRLKEDSRYIFFMEPDANSSGRAPPAFRASFPPLET-GRNLKKEVSRVLCKRCALPPRL 121 

Qy 148 KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNGRKNSRL 204 

, ' :||M 11 1:11 :: ::M 1:1 Mil :|:|: |: | | 

Db 122 KEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPGK — SEL 179 

Qy 205 QFNKVKVEDAGEYVCEAENI LGKDTVRGRL YV 236 

= II l:IM:|: : || |: : | | 

Db 180 R1NKASPADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSESPIRIS 239 

Q y 237 NSVSTTLSSWSG HARKCNETAKSYCYNGGVCYYIEGINQLS CKCPNGFF 285 

I : I : I : I : I I I | | I : : M I I I I : : : : : | I I I I I I 
Db 240 VSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCKCPNEFT 299



Qy 286 GQRC 289 

I I I 

Db 300 GDRC 303 



RESULT 4 
Q9ESA3 

ID Q9ESA3 PRELIMINARY; PRT; 317 AA. 

AC Q9ESA3 ; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Glial growth factor GGF beta 2 (Fragment) . 

GN NRG1 . 

OS Rattus norvegicus (Rat) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Rattus 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; 

RC TISSUE=Axotomized lumbar dorsal root ganglion/spinal cord; 

RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R.J Jr 

RA Frohnert P.W. ; ' ' 

RT "Structural and Functional Diversity of Glial Growth Factor Isoforms 

RT Expressed in Regenerating Peripheral Nerve and Associated Neurons."; 

RL Submitted (OCT-1999) to the EMBL/ GenBank/DDBJ databases. 



DR EMBL; AF194995; AAG28449.1; -.
DR HSSP; Q12784; 1HRE.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR000886; ER_target S.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00047; ig; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS00014; ER_TARGET; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW EGF-like domain; Immunoglobulin domain.
FT NON_TER 1 1
FT NON_TER 317 317

SQ SEQUENCE 317 AA; 34785 MW; 4487FA3E9CD876B9 CRC64;

Query Match 25.7%; Score 450; DB 11; Length 317;

Best Local Similarity 33.9%; Pred. No. 2e-31;

Matches 103; Conservative 54; Mismatches 85; Indels 62; Gaps 10;

^ 43 v QGLVPAGGSSSNSTREPPASGRVALVICv"LDKWPLRSGGLQREQVISV GS 92 

1 : I I I : : : | | | | | | : : : || | : : : : : : | | 

Db 5 VNWTLPTGGPEPST--DQPGDPAPYLVKVHQVWAVKAGGLKKDSLLTVRLDTWGHPAFPS 62



QY 93 CVPLERNQRYIFFLEPT EQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKL 147 

„ 1 1 : : I I I I I : I I | I : : I | | : | | :: | II I I :: I | | | | : | 

Db 63 cgrlkedsryif ™epdanssgrappafrasfpplet-grdlkkevsrvlckrcalpprl 121 

Qy 14 8 KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS— RDIRIKYGNGRKNSRL 204 

1:1 II I II 1:11 :: ::|||:| MM :|:|: | : | | 

Db 122 KEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPGK — SEL 179 

QY 205 QFNKVKVEDAGEYVCEAENILGKDTVRGRL YV o-ac 

= I I : I : I I I : I = = I I I : : ™ 

Db 180 ^INKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSESPIRIS 239 

Qy 237 NSVSTTLSSWSG HARKCNETAKSYCVNGGVCYYIEGINQLS CKCPNGFF 285 

I : I : I : I : I I Ml I : : M M I I: : : : : I I M I I I 
Db 240 VSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCKCPNEFT 299



Qy 286 GQRC 289 

I I I 

Db 300 GDRC 303 



RESULT 5 
Q810X2 

ID Q810X2 PRELIMINARY; PRT; 79 AA 

AC Q810X2; 

DT 01-JUN-2003 (TrEMBLrel. 24, Created) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel . 25, Last annotation update) 

DE Neuregulin 2-alpha (Fragment) . 

OS Mus musculus (Mouse) . 



OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=CD-1; TISSUE=Olfactory bulb;
RA Mautino B., Dalla Costa L., Dati C.;
RT "Bioactive recombinant NRG1, NRG2 and NRG3 expressed in E coli
RL Submitted (JAN-2003) to the EMBL/GenBank/DDBJ databases
DR EMBL; AY227025; AAO72522.1; -.
DR InterPro; IPR006209; EGF_like.
DR Pfam; PF00008; EGF; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; 1.
FT NON_TER 1 1
SQ SEQUENCE 79 AA; 8727 MW; DA4501900C610780 CRC64;

Query Match 25.4%; Score 444; DB 11; Length 79;
Best Local Similarity 100.0%; Pred. No. 1e-31;
Matches 79; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 252 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTP 311
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTP 60

Qy 312 GTGVSSSQWSTSPSTLDLN 330
       |||||||||||||||||||
Db  61 GTGVSSSQWSTSPSTLDLN 79



RESULT 6 
Q8NFN3 

ID Q8NFN3 PRELIMINARY; PRT; 348 AA 

AC Q8NFN3; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Neuregulin 1 isoform GGF2 (Fragment) . 

GN NRG1 . 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Stefansson H., Sigurdsson E., Steinthorsdottir V., Bjornsdottir S.,

RA Sigmundsson T., Ghosh S., Brynjolfsson J., Gunnarsdottir S.,

RA Ivarsson O., Chou T.T., Hjaltason O., Birgisdottir B., Jonsson H.,

RA Gudnadottir V.G., Gudmundsdottir E., Bjornsson A., Ingvarsson B.,

RA Ingason A., Sigfusson S., Hardardottir H., Harvey R.P., Brunner D.,

RA Mutel V., Gonzalo A., Lemke G., Sainz J., Johannesson G.,

RA Andresson T., Gudbjartsson D., Manolescu A., Frigge M.L., Gurney M.E.,

RA Kong A., Gulcher J.R., Petursson H., Stefansson K.;

RT "Neuregulin 1 and susceptibility to Schizophrenia."; 

RL Submitted (MAR-2002) to the EMBL/ GenBank/DDBJ databases 

DR EMBL; AF491780; AAM71140.1; -. 

DR InterPro; IPR003599; Ig. 



DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2 . 

DR Pfam; PF00047; ig; 1. 

DR SMART; SM00409; IG; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW Immunoglobulin domain. 

FT NON_TER 348 348

SQ SEQUENCE 348 AA; 36997 MW; 15568C6260C5635C CRC64;

Query Match 23.1%; Score 404.5; DB 4; Length 348;

Best Local Similarity 34.4%; Pred. No. 2.3e-27; 

Matches 100; Conservative 49; Mismatches 69; Indels 73; Gaps 11; 



Qy 



-NSTRE 59 



15 GVSLACYS — PSLKSVQDQAYKAPVWEGKV QGLV PAGGSSS- 

1 1= IN II: III: I :| II: : , , . ""T 

Db 58 GASV-CYSSPPSVGSVQELAQRAAWIEGKVHPQRRQQGALDRKAAAAAGEAGAWGGDRE 116 

QY 60 PPASGRVA _ LVKVLDKWPLRSGGLQ 83 

Db 117 PPAAGPRALGPPAEEPLLAANGTVPSWPTAPVPSAGEPGEEAPYLVKVHQVWAVT^AGGLK 176 

Qy 84 REQVISV GSCVPLERNQRYIFFLEP TEQPLVFKTAFAPLDTNGKN 128 

N I : I I I I I : I 

FPPLET-GRN 235 



Db 177 KDSLLTVRLGTWGHPAFPSCGRLKEDSRYIFFMEPDANSTSRAPAAFRAS-- ' ' ' ' ' ' ' 



Qy 129 LKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS 188 

I I I I I :: I I II I : I I : I I I I l| I : | | : : : : I I I • I I I M 

Db 236 LKKEVSRVLCKRCALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRK 295 

Qy 189 ^DIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYV 236 

: : I : I : I : I I : | | : | : | | | : | : : | | | : : : 

Db 296 NKPQNIKIQKKPGK— SELRINKASLADSGEYMCKVISKLGNDSASANITI 344 



RESULT 7 
O35947

ID O35947 PRELIMINARY; PRT; 461 AA.

AC O35947;

DT 01-JAN-1998 (TrEMBLrel. 05, Created) 

DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Pro-neuregulin-1, isoform alpha 2B precursor 

GN NRG1 OR NDF. 

OS Mesocricetus auratus (Golden hamster) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Euthena; Rodentia; Sciurognathi ; Muridae; Cricetinae; 

OC Mesocricetus. 

OX NCBI_TaxID=10036; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM ALPHA2B) , AND SEQUENCE OF 64-81 

RC TISSUE=EMBRYO; 



RX MEDLINE=98196996; PubMed=9537646;
RA Velasco J.A., Feijoo E., Avila M.A., Notario V.;
RT "Secretion of neu differentiation factor-like polypeptides by cph-
RT transformed fibroblasts: cloning and characterization of Syrian
RT hamster neuregulin cDNAs.";
RL Mol. Carcinog. 21:156-163(1998).

CC -!- FUNCTION: DIRECT LIGAND FOR ERBB3 AND ERBB4 TYROSINE KINASE 
CC RECEPTORS. CONCOMITANTLY RECRUITS ERBBl AND ERBB2 CORECEPTORS, 

CC RESULTING IN LIGAND- STIMULATED TYROSINE PHOSPHORYLATION AND 

CC ACTIVATION OF THE ERBB RECEPTORS. MAY PLAY AN IMPORTANT ROLE IN 

CC PROVIDING GROWTH ADVANTAGE IN NEOPLASTIC CELLS. 

CC -!- SUBUNIT: THE CYTOPLASMIC DOMAIN INTERACTS WITH THE LIM DOMAIN 
CC REGION OF LIMK1 (BY SIMILARITY) . 

CC -!- SUBCELLULAR LOCATION: EXISTS AS TYPE I MEMBRANE PROTEIN AND AS A 
CC PROTEOLYTICALLY RELEASED SOLUBLE GROWTH FACTOR FORM. THE MEMBRANE- 

CC BOUND FORM DOES NOT SEEM TO BE ACTIVE (BY SIMILARITY) 

CC TISSUE SPECIFICITY: EXPRESSED AT HIGHER LEVEL AFTER NEOPLASMIC 

CC TRANSFORMATION OF CELLS. 

CC -!- DOMAIN: THE CYTOPLASMIC DOMAIN MAY BE INVOLVED IN THE REGULATION 
CC OF TRAFFICKING AND PROTEOLYTIC PROCESSING. REGULATION OF THE 

CC PROTEOLYTIC PROCESSING INVOLVES INITIAL INTRACELLULAR DOMAIN 

CC DIMERIZATION (BY SIMILARITY) . 

CC -!- DOMAIN: ERBB RECEPTOR BINDING IS ELICITED ENTIRELY BY THE EGF-LIKE 
CC DOMAIN . 

CC -!- PTM: PROTEOLYTIC CLEAVAGE CLOSE TO THE PLASMA MEMBRANE ON THE 
CC EXTERNAL FACE LEADS TO THE RELEASE OF THE SOLUBLE GROWTH FACTOR 

CC FORM (BY SIMILARITY) . 

CC -!- PTM: EXTENSIVE GLYCOSYLATION PRECEDES PROTEOLYTIC CLEAVAGE (BY 
CC SIMILARITY) . 

CC SIMILARITY: CONTAINS 1 EGF-LIKE DOMAIN. 

CC -!- SIMILARITY: CONTAINS 1 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAIN 
CC -!- SIMILARITY: BELONGS TO THE NEUREGULIN FAMILY 
DR EMBL; U96612; AAB71812.1; -. 
DR HSSP; Q12784; 1HRE. 

DR GO; GO: 0016021; C: integral to membrane; IEA. 
DR GO; GO: 0008083; F: growth factor activity; IEA. 
DR GO; GO: 0009790; P:embryonic development; IEA. 
DR InterPro; IPR006209; EGF_like. 
DR InterPro; IPR006210; IEGF. 
DR InterPro; IPR007110; Ig-like. 
DR InterPro; IPR003598; Ig_c2. 
DR InterPro; IPR002154; Neuregulin. 
DR Pfam; PF00008; EGF; 1. 
DR Pfam; PF00047; ig; 1. 
DR Pfam; PF02158; Neuregulin; 1. 
DR PRINTS; PR01089; NEUREGULIN. 
DR SMART; SM00181; EGF; 1. 
DR SMART; SM00408; IGc2; 1. 
DR PROSITE; PS00022; EGF_1; 1. 
DR PROSITE; PS01186; EGF_2; 1. 
DR PROSITE; PS50835; IG_LIKE; 1. 

KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Alternative splicing.
FT PROPEP        1     13       BY SIMILARITY.
FT CHAIN        14    461       PRO-NEUREGULIN-1, MEMBRANE-BOUND FORM.
FT CHAIN        14    241       NEUREGULIN-1.
FT DOMAIN       14    242       EXTRACELLULAR (POTENTIAL).
FT TRANSMEM    243    265       INTERNAL SIGNAL SEQUENCE (POTENTIAL).
FT DOMAIN      266    461       CYTOPLASMIC (POTENTIAL).
FT DOMAIN       50    119       IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      165    177       SER/THR-RICH.
FT DOMAIN      178    222       EGF-LIKE.
FT DISULFID     57    112       BY SIMILARITY.
FT DISULFID    182    196       BY SIMILARITY.
FT DISULFID    190    210       BY SIMILARITY.
FT DISULFID    212    221       BY SIMILARITY.
FT CARBOHYD     73     73       N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    120    120       N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    126    126       N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    164    164       N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 461 AA; 50890 MW; 935C9560F7148336 CRC64;

Query Match 16.6%; Score 290; DB 11; Length 461;
Best Local Similarity 33.2%; Pred. No. 4.6e-17;
Matches 65; Conservative 33; Mismatches 58; Indels 40; Gaps 5;

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELN-RSRDIRIKYGNGRK 200
I I : I I : I I I II I : I I : : | : : : | | | : | | | | | : : | |
Db 34 ALPPRLKEMKIQESAAGSKLVLRCETSSEYPELRFKWFKNGSELNKRTKPQNIKLQKKPG 93

Qy 201 NSRLQFNKVKVEDAGEYVCEAENILGKDTVRG------------------------RLYV 236
I I I I : I : I I I : I : : I I | : , , ,
Db 94 KSELRINKASLADSGEYMCKVISKLGNDSASANITIVDSNEFITGMPASTERAYVSSESP 153

Qy 237 NSVSTTLSSWSG HARKCNETAKSYCVNGGVCYYIEGINQLS CKCP 281
_ _ I: |:| :| :| | || | | : : | | | | | | : : : : : | , ,,
Db 154 IRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKTFCVNGGECFKVKDLSNPSRYLCKCQ 213

Qy 282 NGFFGQRCLEKLPLRL 297
II I I I I : | : : :
Db 214 PGFTGARCTENVPMKV 229



RESULT 8 
Q07112 

ID Q07112 PRELIMINARY; PRT; 241 AA 

AC Q07112; 

DT 01-JAN-1998 (TrEMBLrel. 05, Created) 

DT 01-JAN-1998 (TrEMBLrel . 05, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25 f Last annotation update) 

DE Glial growth factor. 

GN GGFBPP5. 

OS Bos taurus (Bovine) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleos tomi ; 
OC Mammalia; Eutheria; Cetartiodactyla ; Ruminantia; Pecora; Bovoidea; 
OC Bovidae; Bovinae; Bos. 
OX NCBI_TaxID=9913; 
RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Posterior pituitary; 

RX MEDLINE=93205115; PubMed=8 096067 ; 

RA Marchionni M.A., Goodearl A.D.G., Chen M., Bermingham-McDonogh O.,
RA Kirk C., Hendricks M., Danehy F., Misumi D., Sudhalter J.,
RA Kobayashi K., Wroblewski D., Lynch C., Baldasarre M., Hiles I.,
RA Davis J.B., Hsuan J., Totty N.F., Otsu M., McBurney R.N.,
RA Waterfield M.D., Stroobant P., Gwynne D.;
RT "Glial growth factors are alternatively spliced erbB2 ligands
RT expressed in the nervous system.";
RL Nature 362:312-318(1993).
DR EMBL; L12259; AAA30540.1; -.
DR PIR; S32359; S32359.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00047; ig; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; FALSE_NEG.
DR PROSITE; PS50835; IG_LIKE; 1.
SQ SEQUENCE 241 AA; 25955 MW; BF571297E8DA9796 CRC64;

Query Match 15.5%; Score 271; DB 6; Length 241;
Best Local Similarity 32.6%; Pred. No. 9.1e-16;
Matches 62; Conservative 34; Mismatches 50; Indels 44; Gaps 6;



Qy 


142 


Db 


34 


Qy 


199 


Db 


94 


Qy 


237 


Db 


152 


Qy 


280 


Db 


212 



I IMhllll | | |:|| :: ::|||:| || 



: I : 



j o r\±j\*JE im in. v j\ vZjUSUjIl, i V UKAJiN I LGKDTVRGRL YV 21f> 

I 1= :| : 1:11 1:1: : II I: : 



-NSVSTTLSSWSG HARKCNETAKSYCVNGGVCYYIEGINQLS CK 279 

1= 1:1 :| :| I || | I : : | I I I I I : : : : : I II 



RESULT 9
Q8BKI8
ID Q8BKI8 PRELIMINARY; PRT; 211 AA.
AC Q8BKI8;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE NEUREGULIN-1.
GN NRG1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Eye;
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium,
RA the RIKEN Genome Exploration Research Group Phase I & II Team;
RT "Analysis of the mouse transcriptome based on functional annotation of
RT 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
DR EMBL; AK051824; BAC34784.1; -.
DR MGD; MGI:96083; Nrg1.
DR GO; GO:0005737; C:cytoplasm; IDA.
DR GO; GO:0005887; C:integral to plasma membrane; IDA.
DR GO; GO:0045202; C:synaptic junction; IDA.
DR GO; GO:0005176; F:Neu/ErbB-2 receptor binding; IDA.
DR GO; GO:0016477; P:cell migration; IGI.
DR GO; GO:0000902; P:cellular morphogenesis; IDA.
DR GO; GO:0010001; P:glial cell differentiation; IMP.
DR GO; GO:0007507; P:heart development; IDA.
DR GO; GO:0007626; P:locomotory behavior; IMP.
DR GO; GO:0000165; P:MAPKKK cascade; IDA.
DR GO; GO:0007517; P:muscle development; IMP.
DR GO; GO:0042055; P:neuronal lineage restriction; IMP.
DR GO; GO:0045213; P:neurotransmitter receptor metabolism; IMP.
DR GO; GO:0007422; P:peripheral nervous system development; IMP.
DR GO; GO:0045860; P:positive regulation of protein kinase activity; IDA.
DR GO; GO:0046579; P:positive regulation of RAS protein signal t...; IDA.
DR GO; GO:0045595; P:regulation of cell differentiation; IMP.
DR GO; GO:0007416; P:synaptogenesis; IMP.
DR InterPro; IPR003599; Ig.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR Pfam; PF00047; ig; 1.
DR SMART; SM00409; IG; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS50835; IG_LIKE; 1.

SQ SEQUENCE 211 AA; 22893 MW; 75D3674B988BE0D3 CRC64; 

Query Match 13.4%; Score 234; DB 11; Length 211; 

Best Local Similarity 31.7%; Pred. No. 1.4e-12- 

Matches 57; Conservative 32; Mismatches 47; Indels 44; Gaps 6; 

Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNR SRDIRIKYGNG 198 

1 I : I I : I I I I I I I : I I : : : : | | | : | | | | | 1 . 1 

Db 34 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRRNKPQNVKIQKKPG 93 

Q y 199 RKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRG 



-RLYV 236 

1 I : II : I : I ! I : I : : II I : 

Db 94 K— 



I 1= II : |:|||:|: : M |: , 
-SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNDLTTGMSASTERPYVSSE 151 

Qy 237 NSVSTTLSSWSG HARKCNETAKSYCVNGGVCYYIEGINQLS CK 279 

I : I : I : I : I | | | | I : : | | | | | | 1 11 

Db 152 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCWGGECFMVXDLSNPSRYLCK 211 



RESULT 10 
Q9ESA4 

ID Q9ESA4 PRELIMINARY; PRT; 244 AA.
AC Q9ESA4;
DT 01-MAR-2001 (TrEMBLrel. 16, Created)
DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)
DT 01-MAR-2001 (TrEMBLrel. 16, Last annotation update)
DE Glial growth factor (Fragment).
GN NRG1.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Sprague-Dawley;
RA Carroll S.L., Stonecypher M.S., Anderson K.D., Pearson R.J. Jr.,
RA Frohnert P.W.;
RT "Structural and Functional Diversity of Glial Growth Factor Isoforms
RT Expressed in Regenerating Peripheral Nerve and Associated Neurons.";
RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF194994; AAG28434.1;
FT NON_TER 244 244
SQ SEQUENCE 244 AA; 25866 MW; 019CBC2DFFF8F625 CRC64;

Query Match 12.6%; Score 221; DB 11; Length 244;
Best Local Similarity 29.5%; Pred. No. 2.5e-11;
Matches 62; Conservative 28; Mismatches 44; Indels 76; Gaps 8;

Qy 5 PAPGFSMLLFGVSL AC YS — P SLKS VQDQAYKAP WVEGKV 43 

' ' : I I I : III I I : M | : | : | | | - | | | | 

Db 38 PPPLLLLLLLGTA ^PGAAAERAAPAGASVCYSSPPSVGSVQELARRAAWIEGKV^ 97 

Qy 44 ?G LVPAGGSSSNSTREPPASGRV 66 

II : | | | ... | 

Db 98 ^RQQ^^DRK^^GEAGAGARDQPVQDSPPSQDPLPAWWTLPTGGPEPST — DQPGDPAP 155 

Qy 67 ALVKVLDKWPLRSGGLQREQVISV GSCVPLERNQRYIFFLEPT EQ 111

I I I I I ::: I | | :::::: | | | I : : | | I I I : I I

Db 156 YLVKVHQVWAVKAGGLKKDSLLTVRLDTWGHPAFPSCGRLKEDSRYIFFMEPDANSSGRA 215

Qy 112 PLVFKTAFAPLDTNGKNLKKEVGKILCTDC 141
I I : : I I I : I I : I II I I I : : I | |
Db 216 PPAFRASFPPLET-GRNLKKEVSRVLCKRC 244



RESULT 11 
Q810X1 
ID Q810X1 PRELIMINARY; PRT; 54 AA.
AC Q810X1;
DT 01-JUN-2003 (TrEMBLrel. 24, Created)
DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Neuregulin 2-beta (Fragment).
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=CD-1; TISSUE=Olfactory bulb;
RA Mautino B., Dalla Costa L., Dati C.;
RT "Bioactive recombinant NRG1, NRG2 and NRG3 expressed in E. coli.";
RL Submitted (JAN-2003) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY227026; AAO72523.1;
DR InterPro; IPR006209; EGF_like.
DR Pfam; PF00008; EGF; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; 1.
FT NON_TER 1 1
FT NON_TER 54 54
SQ SEQUENCE 54 AA; 6019 MW; C25AA17A4D0BA59A CRC64;

Query Match 11.4%; Score 200; DB 11; Length 54;
Best Local Similarity 89.5%; Pred. No. 2.3e-10;
Matches 34; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy  252 KCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRC 289
        |||||||||||||||||||||||||||||| |: | ||
Db    1 KCNETAKSYCVNGGVCYYIEGINQLSCKCPVGYTGDRC 38



RESULT 12
Q8NFN2
ID Q8NFN2 PRELIMINARY; PRT; 167 AA.
AC Q8NFN2;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Neuregulin 1 isoform GGF (Fragment).
GN NRG1.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RA Stefansson H., Sigurdsson E., Steinthorsdottir V., Bjornsdottir S.,
RA Sigmundsson T., Ghosh S., Brynjolfsson J., Gunnarsdottir S.,
RA Ivarsson O., Chou T.T., Hjaltason O., Birgisdottir B., Jonsson H.,
RA Gudnadottir V.G., Gudmundsdottir E., Bjornsson A., Ingvarsson B.,
RA Ingason A., Sigfusson S., Hardardottir H., Harvey R.P., Brunner D.,
RA Mutel V., Gonzalo A., Lemke G., Sainz J., Johannesson G.,
RA Andresson T., Gudbjartsson D., Manolescu A., Frigge M.L., Gurney M.E.,
RA Kong A., Gulcher J.R., Petursson H., Stefansson K.;
RT "Neuregulin 1 and susceptibility to Schizophrenia.";
RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF491780; AAM71139.1;
DR InterPro; IPR003599; Ig.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR Pfam; PF00047; ig; 1.
DR SMART; SM00409; IG; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW Immunoglobulin domain.
FT NON_TER 167 167
SQ SEQUENCE 167 AA; 17983 MW; 9C2FB3A579325FF4 CRC64;

Query Match 10.3%; Score 180.5; DB 4; Length 167;
Best Local Similarity 30.9%; Pred. No. 5.6e-08;
Matches 51; Conservative 30; Mismatches 63; Indels 21; Gaps

Qy 126 GKNLKKEVGKILCTDCAT--RPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRW
Db 11
Qy 179
Db 71
Qy 236
Db 129



RESULT 13
Q86GD6
ID Q86GD6 PRELIMINARY; PRT; 8625 AA.
AC Q86GD6;
DT 01-JUN-2003 (TrEMBLrel. 24, Created)
DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Projectin.
GN PROJ.
OS Procambarus clarkii (Red swamp crayfish).
OC Eukaryota; Metazoa; Arthropoda; Crustacea; Malacostraca;
OC Eumalacostraca; Eucarida; Decapoda; Pleocyemata; Astacidea;
OC Astacoidea; Cambaridae; Procambarus.
OX NCBI_TaxID=6728;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Muscle;
RA Oshino T., Shimamura J., Fukuzawa A., Maruyama K., Kimura S.;
RT "The entire cDNA sequences of projectin isoforms of crayfish claw
RT closer and flexor muscles and their localization.";
RL J. Muscle Res. Cell. Motil. 0:0-0(2003).
DR EMBL; AB055927; BAC66140.1;
DR GO; GO:0005743; C:mitochondrial inner membrane; IEA.
DR GO; GO:0005524; F:ATP binding; IEA.
DR GO; GO:0005488; F:binding; IEA.
DR GO; GO:0004674; F:protein serine/threonine kinase activity; IEA.
DR GO; GO:0004713; F:protein-tyrosine kinase activity; IEA.
DR GO; GO:0003743; F:translation initiation factor activity; IEA.
DR GO; GO:0006468; P:protein amino acid phosphorylation; IEA.
DR GO; GO:0006413; P:translational initiation; IEA.
DR GO; GO:0006810; P:transport; IEA.
DR InterPro; IPR003962; FnIII_subd.
DR InterPro; IPR003961; FN_III.
DR InterPro; IPR008957; FN_III-like.
DR InterPro; IPR003599; Ig.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR003596; Ig_v.
DR InterPro; IPR001993; Mitoch_carrier.
DR InterPro; IPR000719; Prot_kinase.
DR InterPro; IPR002290; Ser_thr_pkinase.
DR InterPro; IPR008271; Ser_thr_pkin_AS.
DR InterPro; IPR001950; TIF_SUI1.
DR InterPro; IPR001245; Tyr_pkinase.
DR Pfam; PF00041; fn3; 39.
DR Pfam; PF00047; ig; 13.
DR Pfam; PF00069; pkinase; 1.
DR PRINTS; PR00014; FNTYPEIII.
DR ProDom; PD000001; Proteinase; 1.
DR SMART; SM00060; FN3; 39.
DR SMART; SM00409; IG; 36.
DR SMART; SM00408; IGc2; 24.
DR SMART; SM00406; IGv; 3.
DR SMART; SM00220; S_TKc; 1.
DR SMART; SM00219; TyrKc; 1.
DR PROSITE; PS50835; IG_LIKE; 24.
DR PROSITE; PS00215; MITOCH_CARRIER; 2.
DR PROSITE; PS00107; PROTEIN_KINASE_ATP; 1.
DR PROSITE; PS50011; PROTEIN_KINASE_DOM; 1.
DR PROSITE; PS00108; PROTEIN_KINASE_ST; 1.
DR PROSITE; PS01118; SUI1_1; 1.
SQ SEQUENCE 8625 AA; 962637 MW; 56B8E4C4FE0AFC90 CRC64;

Query Match 9.5%; Score 166; DB 5; Length 8625;
Best Local Similarity 24.2%; Pred. No. 0.00022;
Matches 87; Conservative 39; Mismatches 130; Indels 104; Gaps 16;

Qy 2 RRDP APGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPA 49
Db 8115 RRQPLTYKLWQEEGEGAPSFTFLLRPRVIQCH QTCKLLCCLAGKP VPT 8162

Qy 50 GGSSSNST--REPPASGRVALVKVLDKWPLRSGGLQREQVISVG SCVPLER 98
Db 8163 VKWYKGSQELSKFDYSQSHADG-WTIEIW 8221

Qy 99 NQRYIFF--LEPTEQPLV FKTAFAPLDTNGK NLKKEVGKILCTDCATRP 145
Db 8222 DRRYIETTIKDLPPPPTPAIRVDDTSSSSYFTSTHKDGR 8281

Qy 146 KLK--KMKSQTGQV-GEKQS 162
Db 8282 GAKRTLKPYGKRQDSTGSTSRSRSATKELELPPDDSLMGPPGFSGELPKTIAIKDGEALC 8341

Qy 163 LKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAE 222
Db 8342 LKC-TVKGDPEPQVSWFKDGEPLSSSDIIDLKYRQGL--ASLTINEVFPEDEGLWCKAT 8398

Qy 223 NILGKDTVRGRLYVNSVSTTLSSWSGHARKCNETAK SYCVNGGVCYYIEGINQLSCK 279
Db 8399 SSLGSAETKCKLSISPMEQQINGKSGRGDKLPRITQHLLSQEVPDGTAH TLSCK 8452



RESULT 14
Q8I0L3
ID Q8I0L3 PRELIMINARY; PRT; 5175 AA.
AC Q8I0L3;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE FlSrJTnp protein (Corresponding sequence F15G9.4a).
GN F15G9.4 OR HIM-4.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RA Sulston J.E.;
RL Submitted (DEC-1994) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=99069613; PubMed=9851916;
RA none;
RT "Genome sequence of the nematode C.elegans: A platform for
RT investigating biology.";
RL Science 282:2012-2018(1998).
RN [3]
RP SEQUENCE FROM N.A.
RA Kershaw J.K.;
RL Submitted (DEC-1994) to the EMBL/GenBank/DDBJ databases.
DR EMBL; Z47068; CAA87335.1;
DR EMBL; Z47070; CAA87335.1; JOINED.
DR EMBL; Z47070; CAA87344.1; -.
DR EMBL; Z47068; CAA87344.1; JOINED.
DR PIR; T20992; T20992.
DR WormPep; F15G9.4a; CE18595.
DR GO; GO:0016020; C:membrane; IEA.
DR GO; GO:0005509; F:calcium ion binding; IEA.
DR GO; GO:0005215; F:transporter activity; IEA.
DR GO; GO:0006810; P:transport; IEA.
DR InterPro; IPR000152; Asx_hydroxyl_S.
DR InterPro; IPR000515; BPD_transp.
DR InterPro; IPR001881; EGF_Ca.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR003599; Ig.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR003596; Ig_v.
DR Pfam; PF00047; ig; 47.
DR SMART; SM00181; EGF; 2.
DR SMART; SM00179; EGF_CA; 2.
DR SMART; SM00409; IG; 45.
DR SMART; SM00408; IGc2; 47.
DR SMART; SM00406; IGv; 12.
DR PROSITE; PS00010; ASX_HYDROXYL; 1.
DR PROSITE; PS00402; BPD_TRANSP_INN_MEMBR; 1.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS50835; IG_LIKE; 47.
SQ SEQUENCE 5175 AA; 568471 MW; 4B2561803BBC62A4 CRC64;

Query Match 9.4%; Score 164.5; DB 5; Length 5175;
Best Local Similarity 26.5%; Pred. No. 0.00015;
Matches 62; Conservative 26; Mismatches 79; Indels 67; Gaps 10;

Qy 49 AGGSSSNSTR E PPAS GRVALVKVLDK WPLRS 79
Db 595 AGGMSTRKMRLDIMEPPS VKVTPQDVYFNMREGVNLSCEAMGDPKPEVHWYFKG 648

Qy 80 GGLQREQVISVGSCVPLERNQRYIFFL EPTEQPLVFKTAFAPLDTNGKNLKKEV 133
Db 649 RHLLNDYKYQVG QDSKFLYIRDATHHDEGTYECRAMSQAGQARDTTDLML 698

Qy 134 GKILCTDCATRPKLK--KMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDI 191
Db 699 ATPPKVEIIQNKMMVGR--GDRVSFECKTIRGKPHPKIRWFKNGKDLIKPDDY 749

Qy 192 RIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245
Db 750 -IKINEG QLHIMGAKDEDAGAYSCVGENMAGKDVQYANLSVGRVPTI IES 798



RESULT 15 
O76518
ID O76518 PRELIMINARY; PRT; 5198 AA.
AC O76518; Q10036;
DT 01-NOV-1998 (TrEMBLrel. 08, Created)
DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE nssrssr (C. elegans him-4 protein) p^^ g
GN F15G9.4 OR HIM-4.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RX PubMed=11222143;
RA Vogel B.E., Hedgecock E.M.;
RT "Hemicentin, a conserved extracellular member of the immunoglobulin
RT superfamily, organizes epithelial and other cell attachments into
RT oriented line-shaped junctions.";
RL Development 128:883-894(2001).
RN [2]
RP SEQUENCE FROM N.A.
RA Sulston J.E.;
RL Submitted (DEC-1994) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RX MEDLINE=99069613; PubMed=9851916;
RA none;
RT "Genome sequence of the nematode C. elegans: A platform for
RT investigating biology.";
RL Science 282:2012-2018(1998).
RN [4]
RP SEQUENCE FROM N.A.
RA Kershaw J.K.;
RL Submitted (DEC-1994) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF074901; AAC26792.1; -.
DR EMBL; Z47068; CAA87336.1; -.
DR EMBL; Z47070; CAA87336.1; JOINED.
DR EMBL; Z47070; CAA87345.1; -.
DR EMBL; Z47068; CAA87345.1; JOINED.
DR PIR; T43290; T43290.
DR HSSP; P00736; 1APQ.
DR WormPep; F15G9.4b; CE18596.
DR GO; GO:0016020; C:membrane; IEA.
DR GO; GO:0005509; F:calcium ion binding; IEA.
DR GO; GO:0005215; F:transporter activity; IEA.
DR GO; GO:0006810; P:transport; IEA.
DR InterPro; IPR000152; Asx_hydroxyl_S.
DR InterPro; IPR000515; BPD_transp.
DR InterPro; IPR001881; EGF_Ca.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR002035; VWF_A.
DR Pfam; PF00047; ig; 47.
DR SMART; SM00179; EGF_CA; 1.
DR SMART; SM00408; IGc2; 44.
DR SMART; SM00327; VWA; 1.
DR PROSITE; PS00010; ASX_HYDROXYL; 1.
DR PROSITE; PS00402; BPD_TRANSP_INN_MEMBR; 1.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS50835; IG_LIKE; 47.
KW EGF-like domain; Immunoglobulin domain; Signal.
FT SIGNAL 1 24 POTENTIAL.
FT CHAIN 25 5198 HEMICENTIN.
SQ SEQUENCE 5198 AA; 570809 MW; DA8511FF2B58D37B CRC64;

Query Match 9.4%; Score 164.5; DB 5; Length 5198;
Best Local Similarity 26.5%; Pred. No. 0.00015;
Matches 62; Conservative 26; Mismatches 79; Indels 67; Gaps 10;

Qy 49 AGGSSSNSTR E PPAS GRVALVKVLDK WPLRS 79
Db 595 AGGMSTRKMRLDIMEPPS VKVTPQDVYFNMREGVNLSCEAMGDPKPEVHWYFKG 648

Qy 80 GGLQREQVISVGSCVPLERNQRYIFFL EPTEQPLVFKTAFAPLDTNGKNLKKEV 133
Db 649 RHLLNDYKYQVG QDSKFLYIRDATHHDEGTYECRAMSQAGQARDTTDLML 698

Qy 134 GKILCTDCATRPKLK--KMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDI 191
Db 699 ATPPKVEIIQNKMMVGR--GDRVSFECKTIRGKPHPKIRWFKNGKDLIKPDDY 749

Qy 192 RIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSS 245
Db 750


Search completed: August 17, 2004, 14:12:40 
Job time : 39.3089 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on: August 17, 2004, 13:56:40 ; Search time 9.4586 Seconds
(without alignments)
1816.670 Million cell updates/sec

Title: US-09-864-675-2
Perfect score: 1749
Sequence: 1 MRRDPAPGFSMLLFGVSLAC PGTGVSSSQWSTSPSTLDLN 330

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 141681 seqs, 52070155 residues

Total number of hits satisfying chosen parameters: 141681

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : SwissProt_42:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
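
The "% Query Match" values reported for each hit appear to be the hit score expressed as a percentage of the perfect score (1749 for this query). The listing does not state this relationship explicitly, so the following Python sketch treats it as an assumption and simply checks it against a few score/percentage pairs copied from this report. The function name `query_match_percent` is hypothetical and is not part of GenCore.

```python
# Assumption: "% Query Match" = 100 * score / perfect score, to one decimal place.
PERFECT_SCORE = 1749  # "Perfect score" reported in the search header


def query_match_percent(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Return the score as a percentage of the perfect score, rounded to 0.1."""
    return round(100.0 * score / perfect_score, 1)


# (score, % Query Match as printed) for a few hits in this report
listed = [(1610, 92.1), (1576, 90.1), (311.5, 17.8), (200, 11.4), (164.5, 9.4)]

for score, printed in listed:
    # Print the recomputed value next to the printed one for comparison.
    print(score, query_match_percent(score), printed)
```

Under this assumption the recomputed values agree with the printed ones for the pairs listed, which is also a convenient way to cross-check digits that are hard to read in the summary table.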



SUMMARIES 

                  %
Result          Query
   No.   Score  Match Length DB  ID          Description
     1    1610   92.1    850  1  NRG2_HUMAN  O14511 homo sapien
     2    1576   90.1    868  1  NRG2_RAT    O35569 rattus norv
     3    1475   84.3    756  1  NRG2_MOUSE  P56974 mus musculu
     4   311.5   17.8    677  1  NRG1_XENLA  O93383 xenopus lae
     5   293.5   16.8    639  1  NRG1_HUMAN  Q02297 h pro-neure
     6   284.5   16.3    602  1  NRG1_CHICK  Q05199 gallus gall
     7     283   16.2    662  1  NRG1_RAT    P43322 r pro-neure
     8   216.5   12.4    623  1  VEIN_DROME  Q94918 drosophila
     9     155    8.9    298  1  JAM2_HUMAN  P57087 homo sapien
    10     145    8.3    338  1  LAMP_RAT    Q62813 rattus norv
    11     144    8.2    338  1  LAMP_HUMAN  Q13449 homo sapien
    12   141.5    8.1    353  1  CEPU_CHICK  Q90773 gallus gall
    13     140    8.0    761  1  NCA2_HUMAN  P13592 homo sapien
    14     140    8.0    848  1  NCA1_HUMAN  P13591 homo sapien
    15   139.5    8.0   6632  1  UN89_CAEEL  O01761 caenorhabdi
    16     139    7.9   2012  1  DSCA_HUMAN  O60469 homo sapien
    17   135.5    7.7   1356  1  VGR2_HUMAN  P35968 homo sapien
    18     135    7.7   4391  1  PGBM_HUMAN  P98160 homo sapien
    19   133.5    7.6    853  1  NCA1_BOVIN  P31836 bos taurus
    20   133.5    7.6   1367  1  VGR2_MOUSE  P35918 mus musculu
    21   131.5    7.5    338  1  LAMP_CHICK  Q98919 gallus gall
    22   131.5    7.5   1343  1  VGR2_RAT    O08775 rattus norv
    23     131    7.5    296  1  SMDF_HUMAN  Q15491 homo sapien
    24     131    7.5    345  1  OPCM_RAT    P32736 rattus norv
    25     131    7.5    824  1  MLT1_HUMAN  Q9UDY8 homo sapien
    26   130.5    7.5    837  1  NCM2_MOUSE  O35136 mus musculu
    27   130.5    7.5   1040  1  AXO1_RAT    P22063 rattus norv
    28   129.5    7.4   1018  1  CONT_HUMAN  Q12860 homo sapien
    29   129.5    7.4   1091  1  NCA1_CHICK  P13590 gallus gall
    30     129    7.4    345  1  OPCM_HUMAN  Q14982 homo sapien
    31     129    7.4   1036  1  AXO1_CHICK  P28685 gallus gall
    32   128.5    7.3   1010  1  CONT_CHICK  P14781 gallus gall
    33   128.5    7.3   1020  1  CONT_MOUSE  P12960 mus musculu
    34   128.5    7.3   1021  1  CONT_RAT    Q63198 rattus norv
    35     128    7.3   1040  1  AXO1_HUMAN  Q02246 homo sapien
    36     127    7.3    298  1  JAM1_BOVIN  Q9XT56 bos taurus
    37     127    7.3    344  1  NTRI_RAT    Q62718 rattus norv
    38     126    7.2   1217  1  EGF_MOUSE   P01132 mus musculu
    39     124    7.1    344  1  NTRI_MOUSE  Q99PJ0 mus musculu
    40   123.5    7.1    858  1  NCA1_RAT    P13596 rattus norv
    41     123    7.0    345  1  OPCM_BOVIN  P11834 bos taurus
    42     122    7.0    344  1  NTRI_HUMAN  Q9P121 homo sapien
    43   121.5    6.9    725  1  NCA2_MOUSE  P13594 mus musculu
    44   121.5    6.9    837  1  NCM2_HUMAN  O15394 homo sapien
    45   121.5    6.9   1115  1  NCA1_MOUSE  P13595 mus musculu



ALIGNMENTS 



RESULT 1 
NRG2_HUMAN 

ID NRG2_HUMAN STANDARD; PRT; 850 AA.
AC O14511;
DT 15-DEC-1998 (Rel. 37, Created)
DT 15-DEC-1998 (Rel. 37, Last sequence update)
DT 15-MAR-2004 (Rel. 43, Last annotation update)
DE Pro-neuregulin-2 precursor (Pro-NRG2) [Contains: Neuregulin-2 (NRG-2)
DE (Neural- and thymus-derived activator for ERBB kinases) (NTAK)
DE (Divergent of neuregulin 1) (DON-1)].
GN NRG2 OR NTAK.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A. (ISOFORM 1).
RC TISSUE=Neuroblastoma;
RX MEDLINE=98006324; PubMed=9348101;
RA Higashiyama S., Horikawa M., Yamada K., Ichino N., Nakano N.,
RA Nakagawa T., Miyagawa J., Matsushita N., Nagatsu T., Taniguchi N.,
RA Ishiguro H.;
RT "A novel brain-derived member of the epidermal growth factor family
RT that interacts with ErbB3 and ErbB4.";
RL J. Biochem. 122:675-680(1997).
RN [2]
RP SEQUENCE FROM N.A. (ISOFORMS DON-1B AND DON-1R).
RC TISSUE=Fetal brain;
RX MEDLINE=97342638; PubMed=9199335;
RA Busfield S.J., Michnick D.A., Chickering T.W., Revett T.L., Ma J.,
RA Woolf E.A., Comrack C.A., Dussault B.J., Woolf J., Goodearl A.D.J.,
RA Gearing D.P.;
RT "Characterization of a neuregulin-related gene, Don-1, that is highly
RT expressed in restricted regions of the cerebellum and hippocampus.";
RL Mol. Cell. Biol. 17:4007-4014(1997).
RN [3]
RP SEQUENCE FROM N.A. (ISOFORMS 1; 2; 3; 4; 5 AND 6).
RC TISSUE=Fetal brain, and Lung;
RX MEDLINE=99295836; PubMed=10369162;
RA Ring H.Z., Chang H., Guilbot A., Brice A., LeGuern E., Francke U.;
RT "The human neuregulin 2 (NRG2) gene: cloning, mapping and evaluation
RT as a candidate for the autosomal recessive form of Charcot-Marie-Tooth
RT disease linked to 5q.";
RL Hum. Genet. 104:326-332(1999).

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase
CC receptors. Concomitantly recruits ERBB1 and ERBB2 coreceptors,
CC resulting in ligand-stimulated tyrosine phosphorylation and
CC activation of the ERBB receptors. May also promote the
CC heterodimerization with the EGF receptor.
CC -!- SUBCELLULAR LOCATION: EXISTS AS AN TYPE I MEMBRANE PROTEIN AND AS
CC A PROTEOLYTICALLY RELEASED SOLUBLE GROWTH FACTOR FORM. THE
CC MEMBRANE-BOUND FORM DOES NOT SEEM TO BE ACTIVE (BY SIMILARITY).
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=8;
CC Name=1;
CC IsoId=O14511-1; Sequence=Displayed;
CC Name=2;
CC IsoId=O14511-2; Sequence=VSP_003453;
CC Name=3;
CC IsoId=O14511-3; Sequence=VSP_003455;
CC Name=4;
CC IsoId=O14511-4; Sequence=VSP_003454;
CC Name=5;
CC IsoId=O14511-5; Sequence=VSP_003458, VSP_003459;
CC Name=6;
CC IsoId=O14511-6; Sequence=VSP_003456, VSP_003457;
CC Name=DON-1B;
CC IsoId=O14511-7; Sequence=VSP_003452, VSP_003455;
CC Name=DON-1R;
CC IsoId=O14511-8; Sequence=VSP_003451;
CC -!- TISSUE SPECIFICITY: Restricted to the cerebellum in the adult.
CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation
CC of trafficking and proteolytic processing. Regulation of the
CC proteolytic processing involves initial intracellular domain
CC dimerization (By similarity).
CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like
CC domain (By similarity).
CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the
CC external face leads to the release of the soluble growth factor
CC form (By similarity).
CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By
CC similarity).
CC -!- SIMILARITY: Contains 1 EGF-like domain.
CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.
CC -!- SIMILARITY: Belongs to the neuregulin family.
CC --------------------------------------------------------------------------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC --------------------------------------------------------------------------


DR EMBL; AB005060; BAA23417.1; -.
DR EMBL; AF119162; AAF28848.1; -.
DR EMBL; AF119151; AAF28848.1; JOINED.
DR EMBL; AF119152; AAF28848.1; JOINED.
DR EMBL; AF119153; AAF28848.1; JOINED.
DR EMBL; AF119154; AAF28848.1; JOINED.
DR EMBL; AF119155; AAF28848.1; JOINED.
DR EMBL; AF119158; AAF28848.1; JOINED.
DR EMBL; AF119159; AAF28848.1; JOINED.
DR EMBL; AF119160; AAF28848.1; JOINED.
DR EMBL; AF119161; AAF28848.1; JOINED.
DR EMBL; AF119162; AAF28849.1; -.
DR EMBL; AF119151; AAF28849.1; JOINED.
DR EMBL; AF119152; AAF28849.1; JOINED.
DR EMBL; AF119153; AAF28849.1; JOINED.
DR EMBL; AF119154; AAF28849.1; JOINED.
DR EMBL; AF119156; AAF28849.1; JOINED.
DR EMBL; AF119158; AAF28849.1; JOINED.
DR EMBL; AF119159; AAF28849.1; JOINED.
DR EMBL; AF119160; AAF28849.1; JOINED.
DR EMBL; AF119161; AAF28849.1; JOINED.
DR EMBL; AF119162; AAF28850.1; -.
DR EMBL; AF119151; AAF28850.1; JOINED.
DR EMBL; AF119152; AAF28850.1; JOINED.
DR EMBL; AF119153; AAF28850.1; JOINED.
DR EMBL; AF119154; AAF28850.1; JOINED.
DR EMBL; AF119155; AAF28850.1; JOINED.
DR EMBL; AF119157; AAF28850.1; JOINED.
DR EMBL; AF119158; AAF28850.1; JOINED.
DR EMBL; AF119159; AAF28850.1; JOINED.
DR EMBL; AF119160; AAF28850.1; JOINED.
DR EMBL; AF119161; AAF28850.1; JOINED.
DR EMBL; AF119162; AAF28851.1; -.
DR EMBL; AF119151; AAF28851.1; JOINED.
DR EMBL; AF119152; AAF28851.1; JOINED.
DR EMBL; AF119153; AAF28851.1; JOINED.
DR EMBL; AF119154; AAF28851.1; JOINED.
DR EMBL; AF119156; AAF28851.1; JOINED.
DR EMBL; AF119157; AAF28851.1; JOINED.
DR EMBL; AF119158; AAF28851.1; JOINED.
DR EMBL; AF119159; AAF28851.1; JOINED.
DR EMBL; AF119160; AAF28851.1; JOINED.
DR EMBL; AF119161; AAF28851.1; JOINED.
DR EMBL; AF119158; AAF28852.1; -.
DR EMBL; AF119151; AAF28852.1; JOINED.
DR EMBL; AF119152; AAF28852.1; JOINED.
DR EMBL; AF119153; AAF28852.1; JOINED.
DR EMBL; AF119154; AAF28852.1; JOINED.
DR EMBL; AF119155; AAF28852.1; JOINED.
DR EMBL; AF119156; AAF28852.1; JOINED.
DR EMBL; AF119157; AAF28853.1; -.
DR EMBL; AF119151; AAF28853.1; JOINED.
DR EMBL; AF119152; AAF28853.1; JOINED.
DR EMBL; AF119153; AAF28853.1; JOINED.
DR EMBL; AF119154; AAF28853.1; JOINED.
DR EMBL; AF119155; AAF28853.1; JOINED.
DR EMBL; AF119156; AAF28853.1; JOINED.
DR PIR; JC5700; JC5700.
DR HSSP; Q12784; 1HRE.
DR Genew; HGNC:7998; NRG2.
DR MIM; 603818; -.
DR GO; GO:0005102; F:receptor binding; TAS.
DR GO; GO:0007165; P:signal transduction; TAS.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR002154; Neuregulin.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF02158; Neuregulin; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;



FT   PROPEP        1    111       BY SIMILARITY.
FT   CHAIN       112    850       PRO-NEUREGULIN-2, MEMBRANE-BOUND FORM.
FT   CHAIN       112    404       NEUREGULIN-2.
FT   DOMAIN      112    405       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    406    426       INTERNAL SIGNAL SEQUENCE (POTENTIAL).
FT   DOMAIN      427    850       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN      237    332       IG-LIKE C2-TYPE.
FT   DOMAIN      330    340       SER/THR-RICH.
FT   DOMAIN      341    382       EGF-LIKE.
FT   DOMAIN       10     13       POLY-PRO.
FT   DOMAIN       20     30       POLY-SER.
FT   DOMAIN       33     47       POLY-SER.
FT   DOMAIN       87     90       POLY-ALA.
FT   DOMAIN      721    727       POLY-PRO.
FT   DISULFID    257    311       BY SIMILARITY.
FT   DISULFID    345    359       BY SIMILARITY.
FT   DISULFID    353    370       BY SIMILARITY.
FT   DISULFID    372    381       BY SIMILARITY.
FT   CARBOHYD     52     52       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     53     53       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    147    147       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    278    278       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    346    346       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC      1    233       MRQVCCSALPPPPLEKGRCSSYSDSSSSSSERSSSSSSSSS
FT                                ESGSSSRSSSNNSSISRPAAPPEPRPQQQPQPRSPAARRAA
FT                                ARSRAAAAGGMRRDPAPGFSMLLFGVSLACYSPSLKSVQDQ
FT                                AYKAPVVVEGKVQGLVPAGGSSSNSTREPPASGRVALVKVL
FT                                DKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPL
FT                                VFKTAFAPLDTNGKNLKKEVGKILCTDC -> MSESRRRGR

Query Match 92.1%; Score 1610; DB 1; Length 850;
Best Local Similarity 100.0%; Pred. No. 1.7e-121;
Matches 304; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   93 MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLVPAGGSSSNSTREP 152

Qy   61 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  153 PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 212

Qy  121 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  213 PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 272

Qy  181 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  273 DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 332

Qy  241 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  333 TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 392

Qy  301
        ||||
Db  393



RESULT 2 
NRG2_RAT 

ID NRG2_RAT STANDARD; PRT; 8 68 AA. 

AC 035569; 035073; 035570; 035571; 035572; 

DT 15-DEC-1998 (Rel. 37, Created) 

DT 15-DEC-1998 (Rel. 37, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Pro-neuregulin-2 precursor (Pro-NRG2) [Contains: Neuregulin-2 (NRG-2) 

DE (Neural-and thymus-derived activator for ERBB kinases) (NTAK) 1 

GN NRG2 OR NTAK. 

OS Rattus norvegicus (Rat) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A., SEQUENCE OF 128-162, AND ALTERNATIVE SPLICING 

RX MEDLINE-98006324; PubMed=9348 101 ; 

RA Higashiyama S., Horikawa M. , Yamada K. , Ichino N., Nakano N. , 

RA Nakagawa T . , Miyagawa J., Matsushita N., Nagatsu T., Taniguchi N., 

RA Ishiguro H. ; 



RT "A novel brain-derived member of the epidermal growth factor family 

RT that interacts with ErbB3 and ErbB4."; 

RL J. Biochem. 122:675-680(1997). 

RN [2] 

RP SEQUENCE OF 109-868 FROM N.A. (ISOFORMS 6 AND 7) . 

RC TISSUE=Cerebellum; 

RX MEDLINE=97311397; PubMed=9168114 ; 

RA Chang H., Riese D.J. II, Gilbert W., Stern D.F., McMahan U.J.; 

RT "Ligands for ErbB-family receptors encoded by a neuregulin-like 

RT gene."; 

RL Nature 387:509-512(1997). 

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase 

CC receptors. Concomitantly recruits ERBB1 and ERBB2 coreceptors, 

cc resulting in ligand-stimulated tyrosine phosphorylation and 

cc activation of the ERBB receptors. May also promote the 

CC heterodimerization with the EGF receptor. 

CC -!- SUBCELLULAR LOCATION: EXISTS AS AN TYPE I MEMBRANE PROTEIN AND AS 

CC A PROTEOLYTICALLY RELEASED SOLUBLE GROWTH FACTOR FORM. THE 

CC MEMBRANE- BOUND FORM DOES NOT SEEM TO BE ACTIVE (BY SIMILARITY) . 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=7; 

CC Comment=Additional isoforms seem to exist. The alpha-type and

CC beta-type differ in the EGF-LIKE domain;

CC Name=1; Synonyms=NTAK-alpha1;

CC IsoId=O35569-1; Sequence=Displayed;

CC Name=2; Synonyms=NTAK-alpha2A;

CC IsoId=O35569-2; Sequence=VSP_003471;

CC Name=3; Synonyms=NTAK-alpha2B, NTAK-alpha2-lP;

CC IsoId=O35569-3; Sequence=VSP_003466, VSP_003471;

CC Name=4; Synonyms=NTAK-beta;

CC IsoId=O35569-4; Sequence=VSP_003470;

CC Name=5; Synonyms=NTAK-gamma;

CC IsoId=O35569-5; Sequence=VSP_003467, VSP_003468;

CC Name=6; Synonyms=NRG2-alpha;

CC IsoId=O35569-6; Sequence=VSP_003472, VSP_003473;

CC Name=7; Synonyms=NRG2-beta;

CC IsoId=O35569-7; Sequence=VSP_003465, VSP_003469;

CC -!- TISSUE SPECIFICITY: Expressed in most parts of the brain, 

CC especially the olfactory bulb and cerebellum where it localizes in 

CC granule and Purkinje cells. In the hippocampus, found in the 

CC granule cells of the dentate gyrus. In the basal forebrain, found 

CC in the cholinergic cells. In the hindbrain, weakly detectable in 

CC the motor trigeminal nucleus. Not detected in the hypothalamus. 

CC Also found in the liver and in the thymus. Not detected in heart, 

CC adrenal gland, or testis. 

CC -!- DEVELOPMENTAL STAGE: In the embryo, expressed in the brain of 

CC E11.5 embryos where it is found in the telencephalon, but not in 

CC the hindbrain. Not found in the heart. In the adult, found in 

CC brain and thymus. 

CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation 

CC of trafficking and proteolytic processing. Regulation of the 

CC proteolytic processing involves initial intracellular domain 

CC dimerization (By similarity).

CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like

CC domain (By similarity).

CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the

CC external face leads to the release of the soluble growth factor

CC form (By similarity).

CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By

CC similarity).

CC -!- SIMILARITY: Contains 1 EGF-like domain.

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.

CC -!- SIMILARITY: Belongs to the neuregulin family.

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; D89995; BAA23344.1;
DR EMBL; D89996; BAA23345.1;
DR EMBL; D89997; BAA23346.1;
DR EMBL; D89998; BAA23347.1;
DR EMBL; AB001576; BAA23348.1;
DR PIR; JC5701; JC5701.
DR PIR; JC5702; JC5702.
DR HSSP; Q12784; 1HRE.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR002154; Neuregulin.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF02158; Neuregulin; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.

KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Multigene family; Alternative splicing.

PRO-NEUREGULIN-2, MEMBRANE-BOUND FORM.
NEUREGULIN-2.
EXTRACELLULAR (POTENTIAL).
INTERNAL SIGNAL SEQUENCE (POTENTIAL).
CYTOPLASMIC (POTENTIAL).
IG-LIKE C2-TYPE.
SER/THR-RICH.
EGF-LIKE.
POLY-SER.
POLY-SER.
POLY-THR.
POLY-ALA.
POLY-PRO.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
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SEQUENCE 


868 AA; 


937 


Query Match 




90 



N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
Missing (in isoform 7).
/FTId=VSP_003465.
PLV -> FFF (in isoform 3).
/FTId=VSP_003466.
C -> G (in isoform 5).
/FTId=VSP_003467.
Missing (in isoform 5).
/FTId=VSP_003468.
NGFFGQRCLEKLPLRLYMPDPKQ -> VGYTGDRCQQFAMV
NFS (in isoform 7).
/FTId=VSP_003469.

NGFFGQRCLEKLPLRLYMPDPKQKHLGFELKE -> VGYTG
DRCQQFAMVNFSK (in isoform 4).
/FTId=VSP_003470.

Missing (in isoform 2 and isoform 3).
/FTId=VSP_003471.

HLGFELKEAEELYQKRVLTITGICVA -> SVLWDTPGTGV
SSSQWSTSPSTLDLN (in isoform 6).
/FTId=VSP_003472.
Missing (in isoform 6).
/FTId=VSP_003473.
S -> F (IN REF. 2).
R -> H (IN REF. 2).
MW; 3C7D4D94DBE64DE2 CRC64; 



Score 1576; DB 1; Length 868;

Best Local Similarity 96.7%; Pred. No. 9.5e-119;

Matches 297; Conservative 4; Mismatches 6; Indels 0; Gaps 0;

MRRDPAPGFSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREP 60 

11 1 1 1 1 > I I I I I I I I I I I I I M I I I I I I I I I I I I I I | | | | | | | | | | | || 

MRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLAPAGGSSSNSTREP 168 



PASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFA 120 

I MM IN II MM III I II III I II I II II M I II MM I III II I III 

PASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTEQPLVFKTAFA 228 



PLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFK 180 
1:1 '''' : ' I I I I I I 1 I I I I M M II M I I I I II : I I | | | II II I II I M I I I I I II I I 
PVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAAAGNPQPSYRWFK 288 

DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVS 240 

I M I I ! M I M ; M ! I I I I M I I M I I II II M I II I I I I :||||| 

DGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLHVNSVS 348 

TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 300 

' ' ' ' I I I I I M I M I I I I I I I Ml MM Mill Ml MM II Ml III I 

TTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 408 



MM I 



RESULT 3 
NRG2_MOUSE 

ID NRG2_MOUSE STANDARD; PRT; 756 AA. 

AC P56974; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Pro-neuregulin-2 precursor (Pro-NRG2) [Contains: Neuregulin-2 (NRG-2)

DE (Divergent of neuregulin 1) (DON-1)].

GN NRG2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS NRG2-5; NRG2-10 AND NRG2-16A).

RC STRAIN=C57BL/6; TISSUE=Brain;

RX MEDLINE=97311398; PubMed=9168115;

RA Carraway K.L. III, Weber J.L., Unger M.J., Ledesma J., Yu N.,

RA Gassmann M., Lai C.;

RT "Neuregulin-2, a new ligand of ErbB3/ErbB4-receptor tyrosine

RT kinases.";

RL Nature 387:512-516(1997). 

RN [2] 

RP SEQUENCE OF 150-756 FROM N.A. (ISOFORMS DON-1M AND DON-1S).

RC TISSUE=Choroid plexus;

RX MEDLINE=97342638; PubMed=9199335;

RA Busfield S.J., Michnick D.A., Chickering T.W., Revett T.L., Ma J.,

RA Woolf E.A., Comrack C.A., Dussault B.J., Woolf J., Goodearl A.D.J.,

RA Gearing D.P.;

RT "Characterization of a neuregulin-related gene, Don-1, that is highly 

RT expressed in restricted regions of the cerebellum and hippocampus."; 

RL Mol. Cell. Biol. 17:4007-4014(1997). 

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase 

CC receptors. Concomitantly recruits ERBB1 and ERBB2 coreceptors, 

CC resulting in ligand-stimulated tyrosine phosphorylation and 

CC activation of the ERBB receptors. May also promote the 

CC heterodimerization with the EGF receptor. 

CC -!- SUBCELLULAR LOCATION: EXISTS AS A TYPE I MEMBRANE PROTEIN AND AS

CC A PROTEOLYTICALLY RELEASED SOLUBLE GROWTH FACTOR FORM. THE

CC MEMBRANE-BOUND FORM DOES NOT SEEM TO BE ACTIVE (BY SIMILARITY).

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=4; 

CC Comment=Additional isoforms seem to exist; 

CC Name=NRG2-16A;

CC IsoId=P56974-1; Sequence=Displayed;

CC Name=DON-1M;

CC IsoId=P56974-2; Sequence=VSP_003464;

CC Name=DON-1S; Synonyms=NRG2-5;

CC IsoId=P56974-3; Sequence=VSP_003462, VSP_003463;

CC Name=NRG2-10;

CC IsoId=P56974-4; Sequence=VSP_003460, VSP_003461;

CC -!- TISSUE SPECIFICITY: Highest expression in the brain, with lower

CC levels in the lung. In the cerebellum, found in granule and

CC Purkinje cells.

CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation
CC of trafficking and proteolytic processing. Regulation of the
CC proteolytic processing involves initial intracellular domain
CC dimerization (By similarity).

CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like
CC domain (By similarity).

CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the
CC external face leads to the release of the soluble growth factor
CC form (By similarity).

CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By
CC similarity).

CC -!- SIMILARITY: Contains 1 EGF-like domain.

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.
CC -!- SIMILARITY: Belongs to the neuregulin family.
DR HSSP; Q12784; 1HRE.
DR MGD; MGI:1098246; Nrg2.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR002154; Neuregulin.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF02158; Neuregulin; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.

KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Multigene family; Alternative splicing.

BY SIMILARITY. 

PRO-NEUREGULIN-2, MEMBRANE-BOUND FORM. 
NEUREGULIN-2.

EXTRACELLULAR (POTENTIAL).
INTERNAL SIGNAL SEQUENCE (POTENTIAL).
CYTOPLASMIC (POTENTIAL).
IG-LIKE C2-TYPE.
SER/THR-RICH.
EGF-LIKE.
POLY-PRO.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
C -> G (in isoform NRG2-10).
/FTId=VSP_003460.
Missing (in isoform NRG2-10).
/FTId=VSP_003461.

VGYTGDRCQQFAMVNFSKHLGFELKEAEELYQKRVLTITGI
CVALLWG -> NGFFGQRCLEKLPLRLYMPDPKQSVLWDT
PGTGVSSSQWSTSPSTLDLN (in isoform DON-1S).
/FTId=VSP_003462.
Missing (in isoform DON-1S).
/FTId=VSP_003463.

VGYTGDRCQQFAMVNFSKHLGFELKE -> NGFFGQRCLEK
LPLRLYMPDPKQK (in isoform DON-1M).
/FTId=VSP_003464.
SEQUENCE 756 AA; 82213 MW; 51D85DC918BE678E CRC64;

Query Match 84.3%; Score 1475; DB 1; Length 756;

Best Local Similarity 95.8%; Pred. No. 1e-110;

Matches 277; Conservative 6; Mismatches 6; Indels

RESULT 4
NRG1_XENLA

ID NRG1_XENLA STANDARD; PRT; 677 AA. 

AC O93383; Q9W6N0; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Pro-neuregulin-1 precursor (Pro-NRG1) [Contains: Neuregulin-1].

GN NRG1.

OS Xenopus laevis (African clawed frog).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae; 

OC Xenopodinae; Xenopus. 

OX NCBI_TaxID=8355; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM ALPHA1), AND ALTERNATIVE SPLICING. 

RX MEDLINE=98352126; PubMed=9685585;

RA Yang J.F., Zhou H., Pun S., Ip N.Y., Peng H.B., Tsim K.W.K.;

RT "Cloning of cDNAs encoding xenopus neuregulin: expression in myotomal

RT muscle during embryo development.";

RL Brain Res. Mol. Brain Res. 58:59-73(1998).

RN [2]

RP SEQUENCE FROM N.A. (ISOFORM CRD).

RX MEDLINE=99316087; PubMed=10383827;

RA Yang J.F., Zhou H., Choi R.C., Ip N.Y., Peng H.B., Tsim K.W.K.;

RT "A cysteine-rich form of Xenopus neuregulin induces the expression of

RT acetylcholine receptors in cultured myotubes.";

RL Mol. Cell. Neurosci. 13:415-429(1999). 

CC -!- FUNCTION: Direct ligand for the ERBB tyrosine kinase receptors. 

CC Induces expression of acetylcholine receptor in synaptic nuclei. 

CC -!- SUBCELLULAR LOCATION: Exists as a type I membrane protein and as a 

CC proteolytically released soluble growth factor form. The membrane- 

CC bound form does not seem to be active (By similarity).

CC -!- ALTERNATIVE PRODUCTS:

CC Event=Alternative splicing; Named isoforms=2;

CC Comment=Additional isoforms seem to exist. Isoforms have alpha

CC or beta-type EGF-like domains;

CC Name=Alpha1;

CC IsoId=O93383-1; Sequence=Displayed;

CC Name=CRD; Synonyms=CRD-NRG1, Cysteine rich domain;

CC IsoId=O93383-2; Sequence=VSP_003449, VSP_003450;

CC -!- TISSUE SPECIFICITY: Isoform alpha1 is expressed in brain and

CC muscle. Isoform CRD is expressed in brain and spinal cord, but at 

CC very low level in muscle. 

CC -!- DEVELOPMENTAL STAGE: Strong expression in developing brain and 

CC spinal cord of the embryo. Also expressed in the myotomal muscle. 

CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation 

CC of trafficking and proteolytic processing. Regulation of the

CC proteolytic processing involves initial intracellular domain

CC dimerization (By similarity).

CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like

CC domain.

CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the 

CC external face leads to the release of the soluble growth factor 

CC form. 

CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By 

CC similarity).

CC -!- SIMILARITY: Contains 1 EGF-like domain. 

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain. 

CC -!- SIMILARITY: Belongs to the neuregulin family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF076618; AAC26804.1; -. 

DR EMBL; AF142632; AAD33893.1; -. 

DR HSSP; Q12784; 1HRE. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR002154; Neuregulin. 

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF02158; Neuregulin; 1. 

DR PRINTS; PR01089; NEUREGULIN. 



DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Alternative splicing.
FT CHAIN 1 259 NEUREGULIN ALPHA1 (BY SIMILARITY).
FT CHAIN 1 677 PRO-NEUREGULIN ALPHA1, MEMBRANE-BOUND
FT FORM (BY SIMILARITY).
FT DOMAIN 1 260 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 261 280 INTERNAL SIGNAL SEQUENCE (POTENTIAL).
FT DOMAIN 281 677 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 37 132 IG-LIKE C2-TYPE.
FT DOMAIN 188 232 EGF-LIKE.
FT DISULFID 57 116 BY SIMILARITY.
FT DISULFID 192 206 BY SIMILARITY.
FT DISULFID 200 220 BY SIMILARITY.
FT DISULFID 222 231 BY SIMILARITY.
FT DOMAIN 1 25 LYS-RICH.
FT CARBOHYD 124 124 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 130 130 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 1 136 MAEKKKVKEGKGRKGKGKKDRKGKKAEGSDQGAAASPKLKE
FT IKTQSVQEGKKLVLKCQAVSEQPSLKFRWFKGEKEIGAKNK
FT PDSKPEHIKIRGKKKSSELQISKASSADNGEYKCMVSNQLG
FT NDTVTVNVTIVPK -> MSEDTAEGLQNQCSEQSSDPPSAE
FT LQNEESMPETQDEEETTHGITGLAITCCVCLEADRLRICLN
FT SEKICIIPILACLISLCLCIAGLKWVFVDKIFEYDSPTHLD
FT PGHRGQDLILYTDTAPSTLVPSSVRTLPVIIPTTDSKAAVT
FT FKFGTSLLPTE (in isoform CRD).
FT /FTId=VSP_003449.
FT VARSPLIC 223 252 KPGFTGARCTETDPLRVVRSEKHLGIEFME -> PNEFTGD
FT RCQNYVMASFYK (in isoform CRD).
FT /FTId=VSP_003450.
SQ SEQUENCE 677 AA; 75794 MW; 49279E8F5BAE396F CRC64;



Query Match 17.8%; Score 311.5; DB 1; Length 677;

Best Local Similarity 34.7%; Pred. No. 2.1e-17;

Matches 78; Conservative 23; Mismatches 71; Indels 53; Gaps 5;



Qy 126 GKNLKKEVGKIL CTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDG 182
II I II I I I I I : : I : I : I I : | : I I : I : | : ! | | ,
Db 15 GKGKKDRKGKKAEGSDQGAAASPKLKEIKTQSVQEGKKLVLKCQAVSEQPSLKFRWFKGE 74

Qy 183 KELNR S RDI RI KYGNGRKN S RLQFNKVKVEDAGE YVCEAENI LGKDTV 230
I I : I : II : I : I I I : I I I I I I I M Ml
Db 75 KEIGAKNKPDSKPEHIKIRGKKKSSELQISKASSADNGEYKCMVSNQLGNDTVTVNVTIV 134

Qy 231 RGRLYVNSVSTTL SSWSGHARKCNE 255
: I I M :: :: | | | |::
Db 135 PKPTYNHLLLMKIYLKVTSVEKSVEPSTLNLLESQKEVIFATTKRGDTTAGPGHLIKCSD 194

Qy 256 TAKSYCVNGGVCYYIEGI NQLSCKCPNGFFGQRCLEKLPLRL 297
I : I I I I I I I I = I I II III I I I I I I |||:
Db 195 KEKTYCVNGGECYVLNGITSSNQFMCKCKPGFTGARCTETDPLRV 239



RESULT 5 
NRG1_HUMAN 

ID NRG1_HUMAN STANDARD; PRT; 639 AA. 

AC Q02297; O14667; P98202; Q02298; Q02299; Q07110; Q07111; Q12779; 

AC Q12780; Q12781; Q12782; Q12783; Q12784; Q9UPE3; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Pro-neuregulin-1 precursor (Pro-NRG1) [Contains: Neuregulin-1 (Neu

DE differentiation factor) (Heregulin) (HRG) (Breast cancer cell

DE differentiation factor p45) (Acetylcholine receptor inducing activity)

DE (ARIA) (Sensory and motor neuron-derived factor) (Glial growth

DE factor)].

GN NRG1 OR HGL OR NDF OR HRGA OR GGF OR SMDF.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS 1; 6; 7 AND 8), AND PARTIAL

RP SEQUENCE.

RX MEDLINE=92271253; PubMed=1350381;

RA Holmes W.E., Sliwkowski M.X., Akita R.W., Henzel W.J., Lee J.,

RA Park J.W., Yansura D., Abadi N., Raab H., Lewis G.D., Shepard H.M.,

RA Kuang W.-J., Wood W.I., Goeddel D.V., Vandlen R.L.;

RT "Identification of heregulin, a specific activator of p185erbB2.";

RL Science 256:1205-1210(1992). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORMS 2; 3; 4; 6; 7 AND 8). 

RC TISSUE=Kidney adenocarcinoma, and Pituitary; 

RX MEDLINE=94158863; PubMed=7509448;

RA Wen D., Suggs S.V., Karunagaran D., Liu N., Cupples R.L., Luo Y.,

RA Janssen A.M., Ben-Baruch N., Trollinger D.B., Jacobsen V.L.,

RA Meng S.-Y., Lu H.S., Hu S., Chang D., Yang W., Yanigahara D.,

RA Koski R.A., Yarden Y.;

RT "Structural and functional aspects of the multiplicity of Neu 

RT differentiation factors."; 

RL Mol. Cell. Biol. 14:1909-1919(1994). 

RN [3] 

RP SEQUENCE FROM N.A. (ISOFORM 1).

RX MEDLINE=92208945; PubMed=1348215;

RA Peles E., Bacus S.S., Koski R.A., Lu H.S., Wen D., Ogden S.G.,

RA Levy R.B., Yarden Y.;

RT "Isolation of the neu/HER-2 stimulatory ligand: a 44 kd glycoprotein 

RT that induces differentiation of mammary tumor cells."; 

RL Cell 69:205-216(1992). 

RN [4] 

RP SEQUENCE FROM N.A. (ISOFORMS 8 AND 9).

RC TISSUE=Brain;

RX MEDLINE=93205115; PubMed=8096067;

RA Marchionni M.A., Goodearl A.D.J., Chen M.S., Bermingham-McDonogh O.,

RA Kirk C., Hendricks M., Danehy F., Misumi D., Sudhalter J.,

RA Kobayashi K., Wroblewski D., Lynch C., Baldasarre M., Hiles I.,

RA Davis J.B., Hsuan J.J., Totty N.F., Otsu M., McBurney R.N.,

RA Waterfield M.D., Stroobant P., Gwynne D.;

RT "Glial growth factors are alternatively spliced erbB2 ligands 



RT expressed in the nervous system."; 

RL Nature 362:312-318(1993). 

RN [5] 

RP SEQUENCE FROM N.A. OF GAMMA-HEREGULIN FUSION PROTEIN. 

RC TISSUE=Breast cancer; 

RX MEDLINE=97472144; PubMed=9333014;

RA Schaefer G., Fitzpatrick V.D., Sliwkowski M.X.;

RT "Gamma-heregulin: a novel heregulin isoform that is an autocrine

RT growth factor for the human breast cancer cell line, MDA-MB-175.";

RL Oncogene 15:1385-1394(1997). 

RN [6] 

RP SEQUENCE OF 1-210 FROM N.A. 

RA Schoumacher F., Herzer S., Flury N., Kueng W., Mueller H., 

RA Eppenberger U.; 

RL Submitted (SEP-1997) to the EMBL/GenBank/DDBJ databases. 

RN [7] 

RP SEQUENCE OF 19-27. 

RX MEDLINE=93366731; PubMed=7689552;

RA Culouscou J.-M., Plowman G.D., Carlton G.W., Green J.M., Shoyab M.;

RT "Characterization of a breast cancer cell differentiation factor that

RT specifically activates the HER4/p180erbB4 receptor.";

RL J. Biol. Chem. 268:18407-18410(1993). 

RN [8] 

RP CHROMOSOMAL TRANSLOCATION. 

RX MEDLINE=99455251; PubMed=10523851;

RA Wang X.-Z., Jolicoeur E.M., Conte N., Chaffanet M., Zhang Y.,

RA Mozziconacci M.-J., Feiner H., Birnbaum D., Pebusque M.-J., Ron D.;

RT "Gamma-heregulin is the product of a chromosomal translocation fusing 

RT the DOC4 and HGL/NRG1 genes in the MDA-MB-175 breast cancer cell 

RT line."; 

RL Oncogene 18:5718-5721(1999). 

RN [9] 

RP CHROMOSOMAL TRANSLOCATION. 

RX MEDLINE=20065180; PubMed=10597312;

RA Liu X., Baker E., Eyre H.J., Sutherland G.R., Zhou M.;

RT "Gamma-heregulin: a fusion gene of DOC-4 and neuregulin-1 derived from 

RT a chromosome translocation."; 

RL Oncogene 18:7110-7114(1999). 

RN [10] 

RP STRUCTURE BY NMR OF 175-241 (ISOFORM 1). 

RX MEDLINE=94341264; PubMed=8062828;

RA Nagata K., Kohda D., Hatanaka H., Ichikawa S., Matsuda S.,

RA Yamamoto T., Suzuki A., Inagaki F.;

RT "Solution structure of the epidermal growth factor-like domain of

RT heregulin-alpha, a ligand for p180erbB-4.";

RL EMBO J. 13:3517-3523(1994). 

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase 

CC receptors. Concomitantly recruits ERBB1 and ERBB2 coreceptors, 

CC resulting in ligand-stimulated tyrosine phosphorylation and 

CC activation of the ERBB receptors. The multiple isoforms perform 

CC diverse functions such as inducing growth and differentiation of 

CC epithelial, glial, neuronal, and skeletal muscle cells; inducing 

CC expression of acetylcholine receptor in synaptic vesicles during

CC the formation of the neuromuscular junction; stimulating

CC lobuloalveolar budding and milk production in the mammary gland 

CC and inducing differentiation of mammary tumor cells; stimulating 

CC Schwann cell proliferation; implication in the development of the 



CC myocardium such as trabeculation of the developing heart. 

CC -!- SUBUNIT: The cytoplasmic domain interacts with the LIM domain 

CC region of LIMK1 (By similarity).

CC -!- SUBCELLULAR LOCATION: Exists as a type I membrane protein and as

CC a proteolytically released soluble growth factor form. The 

CC membrane-bound form does not seem to be active. The secreted 

CC isoform 9 has a signal peptide. The isoform 8 may be nuclear. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=9;

CC Comment=Additional isoforms seem to exist. Isoforms have been

CC classified as type I NRGS (isoforms with an Ig domain and a

CC glycosylation domain, isoforms 1-8), type II NRGS (isoforms with

CC an Ig domain but no glycosylation domain, isoform 9) and type

CC III NRGS (isoforms with a Cys-rich domain, isoform 10). All

CC these isoforms perform distinct tissue-specific functions;

CC Name=1; Synonyms=Alpha;

CC IsoId=Q02297-1; Sequence=Displayed;

CC Name=2; Synonyms=Alpha1A;

CC IsoId=Q02297-2; Sequence=VSP_003431;

CC Name=3; Synonyms=Alpha2B;

CC IsoId=Q02297-3; Sequence=VSP_003434, VSP_003435;

CC Name=4; Synonyms=Alpha3;

CC IsoId=Q02297-4; Sequence=VSP_003432, VSP_003433;

CC Name=6; Synonyms=Beta1, Beta1A;

CC IsoId=Q02297-6; Sequence=VSP_003428;

CC Name=7; Synonyms=Beta2;

CC IsoId=Q02297-7; Sequence=VSP_003427;

CC Name=8; Synonyms=Beta3, GGFHFB1;

CC IsoId=Q02297-8; Sequence=VSP_003429, VSP_003430;

CC Name=9; Synonyms=GGF2, GGFHPP2;

CC IsoId=Q02297-9; Sequence=VSP_003425, VSP_003426, VSP_003429,

CC VSP_003430;

CC Name=10; Synonyms=SMDF;

CC IsoId=Q15491-1; Sequence=External;

CC -!- TISSUE SPECIFICITY: Type I isoforms are the predominant forms 

CC expressed in the endocardium. Isoform alpha is expressed in 

CC breast, ovary, testis, prostate, heart, skeletal muscle, lung, 

CC placenta, liver, kidney, salivary gland, small intestine and brain,

CC but not in uterus, stomach, pancreas, and spleen. Isoform 3 is the

CC predominant form in mesenchymal cells and in nonneuronal organs,

CC whereas isoform 5 is the major neuronal form. Isoform 8 is

CC expressed in spinal cord and brain. Isoform 9 is the major form in

CC skeletal muscle cells; in the nervous system it is expressed in

CC spinal cord and brain. Also detected in adult heart, placenta, 

CC lung, liver, kidney, and pancreas. 

CC -!- DEVELOPMENTAL STAGE: Detectable at early embryonic ages. 

CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation 

CC of trafficking and proteolytic processing. Regulation of the 

CC proteolytic processing involves initial intracellular domain 

CC dimerization (By similarity).

CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like 

CC domain. 

CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the 

CC external face leads to the release of the soluble growth factor 

CC form. 

CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By 

CC similarity).



CC -!- DISEASE: Involved in a rare t(8;11) chromosomal translocation that
CC fuses the 5' end of ODZ4 to NRG1 (isoform 8). The product of this
CC translocation was first thought to be an alternatively spliced
CC isoform, called gamma-heregulin. Gamma-heregulin is a soluble
CC activating ligand for the ERBB2-ERBB3 receptor complex and acts as
CC an autocrine growth factor in a specific breast cancer cell line
CC (MDA-MB-175). Not detected in breast carcinoma samples, including
CC ductal, lobular, medullary, and mucinous histological types,
CC neither in other breast cancer cell lines.
CC -!- SIMILARITY: Contains 1 EGF-like domain.
CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain.
CC -!- SIMILARITY: Belongs to the neuregulin family.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
AAA58640.
DR EMBL; M94168;
DR EMBL; U02325; AAA19950.1;
DR EMBL; U02326; AAA19951.1;
DR EMBL; U02327; AAA19952.1;
DR EMBL; U02328; AAA19953.1;
DR EMBL; U02329; AAA19954.1;
DR EMBL; U02330; AAA19955.1;
DR EMBL; L12260; AAB59622.1;

Query Match 16.8%; Score 293.5; DB 1; Length 639;

Best Local Similarity 32.1%; Pred. No. 5.6e-16;

Matches 71; Conservative 37; Mismatches 62; Indels 51; Gaps 7;

RESULT 6
NRG1_CHICK

ID NRG1_CHICK STANDARD; PRT; 602 AA.



AC Q05199; O73750; O73751; O73752; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Pro-neuregulin-1 precursor (Pro-NRG1) [Contains: Neuregulin-1

DE (Acetylcholine receptor inducing activity) (ARIA)].

GN NRG1 OR ARIA.

OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;

OC Gallus. 

OX NCBI_TaxID=9031; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM 1), AND PARTIAL SEQUENCE. 

RC STRAIN=White leghorn; TISSUE=Brain; 

RX MEDLINE=93201602; PubMed=8453670;

RA Falls D.L., Rosen K.M., Corfas G., Lane W.S., Fischbach G.D.;

RT "ARIA, a protein that stimulates acetylcholine receptor synthesis, is 

RT a member of the neu ligand family."; 

RL Cell 72:801-815(1993). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORMS 2; 3 AND 4).

RC TISSUE=Brain, and Spinal cord;

RX MEDLINE=98150951; PubMed=9491987;

RA Yang X., Kuo Y., Devay P., Yu C., Role L.;

RT "A cysteine-rich isoform of neuregulin controls the level of

RT expression of neuronal nicotinic receptor channels during

RT synaptogenesis.";

RL Neuron 20:255-270(1998). 

CC -!- FUNCTION: Direct ligand for the ERBB tyrosine kinase receptors. 
CC The multiple isoforms perform diverse functions: Cysteine-rich

CC domain containing isoforms (isoforms 2-4) probably regulate the

CC expression of nicotinic acetylcholine receptors at developing

CC interneuronal synapses. The Ig-NRG isoform is required for the

CC initial induction and/or maintenance of the mature levels of

CC acetylcholine receptors at neuromuscular synapses.

CC -!- SUBCELLULAR LOCATION: Exists as a type I membrane protein and as a 
CC proteolytically released soluble growth factor form. The membrane- 

CC bound form does not seem to be active (By similarity).

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=4; 

CC Comment=Additional isoforms seem to exist; 

CC Name=1; Synonyms=ARIA, IG-NRG;

CC IsoId=Q05199-1; Sequence=Displayed;

CC Note=Contains an Ig-like domain;

CC Name=2; Synonyms=CRD-NRG-BETA1A;

CC IsoId=Q05199-2; Sequence=VSP_003445;

CC Note=The EGF-like domain is replaced by a Cysteine-rich domain

CC (CRD);

CC Name=3; Synonyms=CRD-NRG-BETA2A;

CC IsoId=Q05199-3; Sequence=VSP_003445, VSP_003446;

CC Note=The EGF-like domain is replaced by a Cysteine-rich domain

CC (CRD);

CC Name=4; Synonyms=CRD-NRG-BETA2B;

CC IsoId=Q05199-4; Sequence=VSP_003445, VSP_003446, VSP_003447,

CC VSP_003448;

CC Note=The EGF-like domain is replaced by a Cysteine-rich domain

CC (CRD);

CC -!- DEVELOPMENTAL STAGE: Isoforms 2-4 are detected at embryonic day 4
CC (ED4) in both visceral and somatic motor neurons of spinal cord

CC and is highest at ED6. Isoform 1 is not expressed until ED 6 in 

CC spinal cord. At ED 11 both isoforms display comparable levels. 

CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation 
CC of trafficking and proteolytic processing. Regulation of the 

CC proteolytic processing involves initial intracellular domain 

CC dimerization (By similarity).

CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like
CC domain.

CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the
CC external face leads to the release of the soluble growth factor

CC form.

CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage (By
CC similarity).

CC -!- SIMILARITY: Contains 1 EGF-like domain. 

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain. 

CC -!- SIMILARITY: Belongs to the neuregulin family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L11264; AAA49037.1; 

DR EMBL; AF045654; AAC05670.1; -. 

DR EMBL; AF045655; AAC05671.1; -. 

DR EMBL; AF045656; AAC05672.1; -. 

DR PIR; A45769; A45769. 

DR HSSP; Q12784; 1HRE. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR002154; Neuregulin. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF02158; Neuregulin; 1. 

DR PRINTS; PR01089; NEUREGULIN. 

DR SMART; SM00181; EGF; 1. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS00022; EGF_1; 1. 

DR PROSITE; PS01186; EGF_2; FALSE_NEG. 

DR PROSITE; PS50026; EGF_3; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein; 

KW Transmembrane; Alternative splicing. 

FT CHAIN 1 602 PRO-NEUREGULIN-1, MEMBRANE-BOUND FORM.

FT CHAIN 1 205 NEUREGULIN-1.

FT DOMAIN 1 206 EXTRACELLULAR (POTENTIAL).

FT TRANSMEM 207 229 INTERNAL SIGNAL SEQUENCE (POTENTIAL).

FT DOMAIN 230 602 CYTOPLASMIC (POTENTIAL).

FT DOMAIN 29 123 IG-LIKE C2-TYPE.

FT DOMAIN 125 136 SER/THR-RICH.
BY SIMILARITY.
BY SIMILARITY.
169 BY SIMILARITY.
FT DISULFID 171 180 BY SIMILARITY.
CARBOHYD N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 126 126 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC [residue range and sequence change illegible in source] (in
FT isoform 2, isoform 3 and isoform 4).
FT /FTId=VSP_003445.


FT VARSPLIC 191 198 Missing (in isoform 3 and isoform 4).
FT /FTId=VSP_003446.
FT VARSPLIC 388 405 VSAMTTPARMSPVDFHTP -> HTPPTSLLLAGKVSLRVS
FT (in isoform 4).
FT /FTId=VSP_003447.
FT VARSPLIC 406 602 Missing (in isoform 4).
FT /FTId=VSP_003448.
SQ SEQUENCE 602 AA; 67453 MW; 4183C0E56CE5D346 CRC64;


Query Match 16.3%; Score 284.5; DB 1; Length 602;

Best Local Similarity 32.8%; Pred. No. 2.7e-15;

Matches 62; Conservative 34; Mismatches 72; Indels 21; Gaps



Qy 109 TEQPLVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAA 168 

: I I I : I II : I I I I : I I : I I I : I I : I I 

Db 5 SEGPLQYSLAPTQTDVNS S YNTVP PKLKEMKNQEVAVGQKLVLRCETT 52 



Qy 169 AGNPQPSYRWFKDGKEL NRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENIL 225

• I ::| hill: II :::| MM : : Mill I : I 

Db 53 SEYPALRFKWLKNGKEITKKNRPENVKIP-KKQKKYSELHIYRATLADAGEYACRVSSKL 111 

Qy 226 GKDTVRGRLYVNSVSTTLSSWSG--HARKCNETAKSYCVNGGVCYYIEGI NQLSCKC 280

I I : : = : : | : | : | | | | : I : : I M M I I : : : : I : I 

Db 112 GNDSTKASVIITDTNATSTSTTGTSHLTKCDIKQKAFCVNGGECYMVKDLPNPPRYLCRC 171 

Qy 281 PNGFFGQRC 289 

I I I I I I 
Db 172 PNEFTGDRC 180 



RESULT 7 
NRG1_RAT

ID NRG1_RAT STANDARD; PRT; 662 AA.

AC P43322; P43323; P43324; P43325; P43326; P43327; P43328; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Pro-neuregulin-1 precursor (Pro-NRG1) [Contains: Neuregulin-1 (Neu



DE differentiation factor) (Heregulin) (HRG) (Acetylcholine receptor 

DE inducing activity) (ARIA) (Sensory and motor neuron-derived factor) 

DE (Glial growth factor)]. 

GN NRG1 OR NDF. 

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.

RC TISSUE=Fibroblast;

RX MEDLINE=94158863; PubMed=7509448;

RA Wen D., Suggs S.V., Karunagaran D., Liu N., Cupples R.L., Luo Y., 

RA Janssen A.M., Ben-Baruch N., Trollinger D.B., Jacobsen V.L., 

RA Meng S.-Y., Lu H.S., Hu S., Chang D., Yang W., Yanigahara D., 

RA Koski R.A., Yarden Y.;

RT "Structural and functional aspects of the multiplicity of Neu 

RT differentiation factors."; 

RL Mol. Cell. Biol. 14:1909-1919(1994). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORM ALPHA2C/NDF44), AND PARTIAL SEQUENCE.

RC TISSUE=Fibroblast;

RX MEDLINE=92257596; PubMed=1349853;

RA Wen D., Peles E., Cupples R., Suggs S.V., Bacus S.S., Luo Y.,

RA Trail G., Hu S., Silbiger S.M., Levy R.B., Koski R.A., Lu H.S.,

RA Yarden Y.;

RT "Neu differentiation factor: a transmembrane glycoprotein containing 

RT an EGF domain and an immunoglobulin homology unit."; 

RL Cell 69:559-572(1992). 

RN [3] 

RP SEQUENCE OF 14-36. 

RX MEDLINE=92208945; PubMed=1348215;

RA Peles E., Bacus S.S., Koski R.A., Lu H.S., Wen D., Ogden S.G.,

RA Levy R.B., Yarden Y.; 

RT "Isolation of the neu/HER-2 stimulatory ligand: a 44 kd glycoprotein 

RT that induces differentiation of mammary tumor cells."; 

RL Cell 69:205-216(1992). 

RN [4] 

RP REGULATION OF PROCESSING (ISOFORM ALPHA2C/NDF44).

RX MEDLINE=99069430; PubMed=9852099;

RA Liu X., Hwang H., Cao L., Wen D., Liu N., Graham R.M., Zhou M.;

RT "Release of the neuregulin functional polypeptide requires its 

RT cytoplasmic tail."; 

RL J. Biol. Chem. 273:34335-34340(1998). 

RN [5] 

RP INTERACTION WITH LIMK1.

RX MEDLINE=98352096; PubMed=9685409;

RA Wang J.Y., Frenzel K.E., Wen D., Falls D.L.; 

RT "Transmembrane neuregulins interact with LIM kinase 1, a cytoplasmic 

RT protein kinase implicated in development of visuospatial cognition."; 

RL J. Biol. Chem. 273:20525-20534(1998). 

CC -!- FUNCTION: Direct ligand for ERBB3 and ERBB4 tyrosine kinase 
CC receptors. Concomitantly recruits ERBB1 and ERBB2 coreceptors, 

CC resulting in ligand-stimulated tyrosine phosphorylation and

CC activation of the ERBB receptors. The multiple isoforms perform 

CC diverse functions such as inducing growth and differentiation of 

CC epithelial, glial, neuronal, and skeletal muscle cells; inducing 



CC expression of acetylcholine receptor in synaptic vesicles during

CC the formation of the neuromuscular junction; stimulating 

CC lobuloalveolar budding and milk production in the mammary gland 

CC and inducing differentiation of mammary tumor cells; stimulating 

CC Schwann cell proliferation; implication in the development of the 

CC myocardium such as trabeculation of the developing heart (By 

CC similarity).

CC -!- SUBUNIT: The cytoplasmic domain interacts with the LIM domain

CC region of LIMK1.

CC -!- SUBCELLULAR LOCATION: Exists as a type I membrane protein and as a 

CC proteolytically released soluble growth factor form. The membrane- 

CC bound form does not seem to be active. 

CC -!- ALTERNATIVE PRODUCTS:

CC Event=Alternative splicing; Named isoforms=8; 

CC Comment=Additional isoforms seem to exist; 

CC Name=Beta4; Synonyms=NDF42A; 

CC IsoId=P43322-1; Sequence=Displayed;
CC Name=Alpha2A; Synonyms=NDF38;
CC IsoId=P43322-2; Sequence=VSP_003436;
CC Name=Alpha2B; Synonyms=NDF19;
CC IsoId=P43322-3; Sequence=VSP_003436, VSP_003443, VSP_003444;
CC Name=Alpha2C; Synonyms=NDF44;
CC IsoId=P43322-4; Sequence=VSP_003436, VSP_003442;
CC Name=Beta1;
CC IsoId=P43322-5; Sequence=VSP_003437;
CC Name=Beta2; Synonyms=NDF40;
CC IsoId=P43322-6; Sequence=VSP_003440, VSP_003441;
CC Name=BetA2A; Synonyms=NDF22;
CC IsoId=P43322-7; Sequence=VSP_003440;
CC Name=Beta3; Synonyms=NDF4;
CC IsoId=P43322-8; Sequence=VSP_003438, VSP_003439;

CC -!- TISSUE SPECIFICITY: Widely expressed. Most tissues contain alpha2A

CC and alpha2B isoforms. Alpha2 and beta2 are the predominant forms

CC in mesenchymal and nonneuronal organs. Beta1 is enriched in brain

CC and spinal cord, but not in muscle and heart. Alpha2C is highly 

CC expressed in spinal cord, moderately in lung, brain, ovary, and 

CC stomach, in low amounts in the kidney, skin and heart and not 

CC detected in the liver, spleen, and placenta. 

CC -!- DOMAIN: The cytoplasmic domain may be involved in the regulation of

CC trafficking and proteolytic processing. Regulation of the

CC proteolytic processing involves initial intracellular domain

CC dimerization.

CC -!- DOMAIN: ERBB receptor binding is elicited entirely by the EGF-like 

CC domain. 

CC -!- PTM: Proteolytic cleavage close to the plasma membrane on the 

CC external face leads to the release of the soluble growth factor 

CC form. 

CC -!- PTM: Extensive glycosylation precedes the proteolytic cleavage. 

CC -!- SIMILARITY: Contains 1 EGF-like domain.

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain. 

CC -!- SIMILARITY: Belongs to the neuregulin family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; [accession illegible in source]; AAA19940.1;
DR EMBL; U02316; AAA19941.1;
DR EMBL; [illegible in source]
DR EMBL; U02318; AAA19943.1;
DR EMBL; U02319; AAA19944.1;
DR EMBL; U02320; AAA19945.1;
DR EMBL; U02321; AAA19946.1;
DR EMBL; U02322; AAA19947.1;
DR EMBL; U02323; AAA19948.1;
DR EMBL; U02324; AAA19949.1;
DR EMBL; M92430; -; NOT ANNOTATED CDS.
DR PIR; I61718; I61718.
DR PIR; I61719; I61719.
DR PIR; I61722; I61722.
DR HSSP; Q12784; 1HRE.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR002154; Neuregulin.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF02158; Neuregulin; 1.
DR PRINTS; PR01089; NEUREGULIN.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; FALSE_NEG.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW Growth factor; EGF-like domain; Immunoglobulin domain; Glycoprotein;
KW Transmembrane; Multigene family; Alternative splicing.


FT PROPEP 1 13
FT CHAIN 14 662 PRO-NEUREGULIN-1, MEMBRANE-BOUND FORM.
FT CHAIN 14 264 NEUREGULIN-1.
FT DOMAIN 14 265 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 266 288 INTERNAL SIGNAL SEQUENCE (POTENTIAL).
FT DOMAIN 289 662 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 37 128 IG-LIKE C2-TYPE.
FT DOMAIN 165 177 SER/THR-RICH.
FT DOMAIN 178 222 EGF-LIKE.
FT DISULFID 57 112
FT DISULFID 182 196 BY SIMILARITY.
FT DISULFID 190 210 BY SIMILARITY.
FT DISULFID 212 221 BY SIMILARITY.
FT CARBOHYD 120 120 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 126 126 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 164 164 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 213 256 PNEFTGDRCQNYVMASFYMTSRRKRQETEKPLERKLDHSLV
FT KES -> QPGFTGARCTENVPMKVQTQE (in isoform
FT Alpha2A, isoform Alpha2B and isoform
FT Alpha2C).
FT /FTId=VSP_003436.
FT VARSPLIC 231 257 MTSRRKRQETEKPLERKLDHSLVKESK -> KHLGIEFME
FT (in isoform Beta1).
FT /FTId=VSP_003437.
FT VARSPLIC 231 241 MTSRRKRQETE -> STSTPFLSLPE (in isoform
FT Beta3).
FT /FTId=VSP_003438.
FT VARSPLIC 242 662 Missing (in isoform Beta3).
FT /FTId=VSP_003439.
FT VARSPLIC 231 256 Missing (in isoform Beta2 and isoform
FT BetA2A).
FT /FTId=VSP_003440.
FT VARSPLIC 325 330 PPENVQ -> RVRTRG (in isoform Beta2).
FT /FTId=VSP_003441.
FT VARSPLIC 446 662 Missing (in isoform Alpha2C).
FT /FTId=VSP_003442.
FT VARSPLIC 446 484 YVSAMTTPARMSPVDFHTPSSPKSPPSEMSPPVSSMTVS
FT -> HNLIAELRRNKAYRSKCMQIQLSATHLRPSSITHLGFI
FT L (in isoform Alpha2B).
FT /FTId=VSP_003443.
FT VARSPLIC 485 662 Missing (in isoform Alpha2B).
FT /FTId=VSP_003444.
FT CONFLICT 90 90 K -> N (IN REF. 2).
FT CONFLICT 137 137 T -> I (IN REF. 2; AA SEQUENCE).
FT CONFLICT 208 208 Y -> S (IN REF. 2).
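
The VARSPLIC features listed for this entry describe each isoform as a set of edits against the displayed sequence: a 1-based, inclusive residue range that is either replaced ("X -> Y") or dropped ("Missing"). A minimal Python sketch of that bookkeeping follows. The function name `apply_varsplic` and the toy sequence are illustrative assumptions; they are not part of the SWISS-PROT record or of any library.

```python
def apply_varsplic(sequence, edits):
    """Apply VARSPLIC-style edits to a displayed sequence.

    Each edit is a (start, end, replacement) tuple with 1-based,
    inclusive coordinates; replacement is "" for a "Missing" feature.
    Edits are assumed not to overlap and are applied from the
    C-terminus backwards, so earlier coordinates stay valid when an
    edit changes the sequence length.
    """
    result = sequence
    for start, end, replacement in sorted(edits, reverse=True):
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        result = result[:start - 1] + replacement + result[end:]
    return result


# Toy 10-residue sequence (hypothetical): replace 3-5, drop 8-10.
toy = apply_varsplic("ABCDEFGHIJ", [(3, 5, "XY"), (8, 10, "")])
print(toy)
```

Under the same assumptions, isoform Beta3 would correspond to `[(231, 241, "STSTPFLSLPE"), (242, 662, "")]` applied to the 662-residue displayed sequence, which leaves a 241-residue product. The displayed sequence itself is not reproduced in this listing, so that call is not run here.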



Query Match 16.2%; Score 283; DB 1; Length 662; 

Best Local Similarity 33.0%; Pred. No. 4e-15; 

Matches 66; Conservative 32; Mismatches 58; Indels 44; Gaps 6;



Qy 142 ATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRS RDIRIKYGNG 198
I I : I I : I I I I II I : I I : : : : I I I : I I I I I : I : I : I
Db 34 ALPPRLKEMKSQESAAGSKLVLRCETSSEYSSLRFKWFKNGNELNRKNKPENIKIQKKPG 93

Qy 199 RKNSRLQFNKVKVEDAGEYVCEAENI LGKDTVRGRL YV 236
: I I : I I : I : I II : I : : II I : : | |
Db 94 K--SELRINKASLADSGEYMCKVISKLGNDSASANITIVESNEFITGMPASTETAYVSSE 151

Qy 237 NSVSTTLSSWSG--HARKCNETAKS YCVNGGVCYYI EGINQLS CK 279
I : I : I : I : I I III I : : I I I I I I : : : : : I | |
Db 152 SPIRISVSTEGANTSSSTSTSTTGTSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCK 211

Qy 280 CPNGFFGQRCLEKLPLRLYM 299
I II I I I I : ||
Db 212 CPNEFTGDRCQNYVMASFYM 231
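
The summary figures printed with each alignment appear to be related by simple arithmetic: for the results listed here, Best Local Similarity equals Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). That relation is inferred from the reported numbers, not taken from the search program's documentation, and the function name below is an illustrative assumption.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Return the percentage of alignment columns that are exact matches."""
    columns = matches + conservative + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / columns


# Figures reported for the NRG1_RAT alignment (RESULT 7).
print(round(best_local_similarity(66, 32, 58, 44), 1))  # prints 33.0
```

The same calculation reproduces the 26.1% reported for VEIN_DROME (81, 41, 107, 81) and the 27.7% reported for JAM2_HUMAN (56, 24, 86, 36).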



RESULT 8 
VEIN_DROME 

ID VEIN_DROME STANDARD; PRT; 623 AA. 

AC Q94918; Q9VRQ3; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Vein protein precursor (Epidermal growth factor-like protein) 
DE (Defective dorsal discs protein).
GN VN OR DDD OR CG10491.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 



OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).

RC TISSUE=Embryo, and Imaginal disks;

RX MEDLINE=96421972; PubMed=8824589;

RA Schnepp B.C., Grumbling G.B., Donaldson T.D., Simcox A.A.;

RT "Vein is a novel component in the Drosophila epidermal growth factor

RT receptor pathway with similarity to the neuregulins.";

RL Genes Dev. 10:2302-2313(1996). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J.H., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000). 

RN [3] 

RP REVISIONS. 

RX MEDLINE=22426069; PubMed=12537572;
RA Misra S., Crosby M.A., Mungall C.J., Matthews B.B., Campbell K.S.,
RA Hradecky P., Huang Y., Kaminker J.S., Millburn G.H., Prochnik S.E.,
RA Smith C.D., Tupy J.L., Whitfield E.J., Bayraktaroglu L., Berman B.P.,
RA Bettencourt B.R., Celniker S.E., de Grey A.D.N.J., Drysdale R.A.,
RA Harris N.L., Richter J., Russo S., Schroeder A.J., Shu S.Q.,
RA Stapleton M., Yamada C., Ashburner M., Gelbart W.M., Rubin G.M.,
RA Lewis S.E.;
RT "Annotation of the Drosophila melanogaster euchromatic genome: a
RT systematic review.";
RL Genome Biol. 3:RESEARCH0083.1-RESEARCH0083.22(2002).

CC -!- FUNCTION: Ligand for the EGF receptor. Seems to play a role in 

CC the global proliferation of wing disc cells and the larval 

CC patterning. Shows a strong synergistic genetic interaction with 

CC spi, suggesting a molecular interdependence. Required for the 

CC development of interveins cells.

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=2; 

CC Name=1;

CC IsoId=Q94918-1; Sequence=Displayed;

CC Name=2;

CC IsoId=Q94918-2; Sequence=VSP_001419;

CC -!- DEVELOPMENTAL STAGE: Expressed in blastoderm embryos in two 

CC ventrolateral stripes that are brought to the midline as 

CC gastrulation proceeds. In the germ-band retraction stage, 

CC expression is seen in the CNS and epidermis. At late blastoderm, 

CC expression is localized in the anlagen of the amnioserosa. 

CC Expression in the head, clypeolabrum, maxillary and labial lobes,

CC and around the stomodeum throughout embryo development. In late 

CC embryos, expression decays in all ectodermal cells and appears in 

CC the segmental muscles and the gut wall. In the larva, expression 

CC occurs in the dorsal metathoracic disc, the eye-antennal disc and 

CC the ventral thoracic disc. Found in the intervein in the pupa. 

CC -!- SIMILARITY: Contains 1 EGF-like domain. 

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U67935; AAC47293.1; -. 

DR EMBL; AE003564; AAF50739.2; -. 

DR FlyBase; FBgn0003984; vn. 

DR GO; GO:0005576; C:extracellular; NAS.

DR GO; GO:0005154; F:epidermal growth factor receptor binding; IMP.

DR GO; GO:0007173; P:EGF receptor signaling pathway; IMP.

DR GO; GO:0007477; P:notum morphogenesis; IMP.

DR GO; GO:0045742; P:positive regulation of EGF receptor signali...; NAS.

DR GO; GO:0007476; P:wing morphogenesis; IMP.

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00047; ig; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00408; IGc2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS01186; EGF_2; FALSE_NEG.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
KW EGF-like domain; Glycoprotein; Immunoglobulin domain; Growth factor;
KW Developmental protein; Alternative splicing; Signal.
FT DOMAIN 138 314 GLN-RICH.
FT DISULFID 478 531 BY SIMILARITY.
FT CARBOHYD 76 76 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 211 211 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 252 252 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD [position illegible in source] N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 381 381 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 424 424 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 449 449 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 521 521 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 574 574 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 609 623 FVAIYGQIHTLNNDY -> SSPESCKNYQGGYY (in
FT isoform 2).
FT /FTId=VSP_001419.
FT CONFLICT 149 149 MISSING (IN REF. 1).
FT CONFLICT 220 220 S -> T (IN REF. 1).
FT CONFLICT 494 494 E -> D (IN REF. 1).
SQ SEQUENCE 623 AA; 71697 MW; AFD2724D5C1F56C8 CRC64;


Query Match 12.4%; Score 216.5; DB 1; Length 623;



Best Local Similarity 26.1%; Pred. No. 8.2e-10; 

Matches 81; Conservative 41; Mismatches 107; Indels 81; Gaps 15; 

Qy 32 AYKAPVWEG KVQGLVPAGGSSSNSTREPPA 62

I : I I I : I : : I I : I I I : I 

Db 329 7VFAAPTVFQGVFKSMSADRRWFSATMKVEKVYKQQHDLQLPTLVRLQFALSNSSGECD- 387 

Qy 63 SGRVALVKVLDKWPLRSGG-LQREQVISVGSCVPLERNQRYIFFLEPTE QP- 112 

: : : : : I I I I | | : | | I : | : : I | | 

Db 388 IYRERLMPRGMLRSGNDLQQASDIS YMMFVQQTNPGNFTILGQPM 432 

Qy 113 LVFKTAFAPLDTNGKNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAA 168 

II: : I I I I I : : I : I | : | : | I 

Db 433 RVTHL WEAVE TAVS EN- YTQNAEVT KI F SKPSKAIIKH GKKLRIVCE-V 480

Qy 169 AGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRL QFNKVKVEDAGEYVCEAENIL 225 

•I I I MM I :li 1:1 :: : :: | | || III I I hi 

Db 481 SGQPPPKVTWFKDEKSINRKRNI-YQFKHHKRRSELIVRSFN--SSSDAGRYECRAKNKA 537

Qy 226 GKDTVRGRLYVNSVSTTL-SSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGF 284

I : I : : : : I I I I I : I I I I I : : I : I I : 

Db 538 SKAIAKRRIMI KASPVHFPTDRSASGI PCN FDYCFHNGTCRMI PDINEVYCRCPTEY 594 



Qy 285 FGQRCLEKLP 294 

I I I I II 
Db 595 FGNRCENKWP 604 



RESULT 9 
JAM2_HUMAN 

ID JAM2_HUMAN STANDARD; PRT; 298 AA.

AC P57087; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Junctional adhesion molecule 2 precursor (Vascular endothelial 

DE junction-associated molecule) (VE-JAM).

GN JAM2 OR VEJAM OR C21ORF43.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Vascular endothelial cells; 

RX MEDLINE=20317114; PubMed=10779521;

RA Palmeri D., van Zante A., Huang C.C., Hemmerich S., Rosen S.D.;

RT "Vascular endothelial junction-associated molecule, a novel member of

RT the immunoglobulin superfamily, is localized to intercellular 

RT boundaries of endothelial cells."; 

RL J. Biol. Chem. 275:19139-19145(2000). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Placenta; 

RX MEDLINE=20507930; PubMed=10945976;

RA Cunningham S.A., Arrate M.P., Rodriguez J.M., Bjercke R.J.,

RA Vanderslice P., Morris A.P., Brock T.A.;

RT "A novel protein with homology to the junctional adhesion molecule: 

RT Characterization of leukocyte interactions."; 

RL J. Biol. Chem. 275:34750-34756(2000). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Lung; 

RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,
RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,
RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

CC -!- FUNCTION: MAY PLAY A ROLE IN THE PROCESSES OF LYMPHOCYTE HOMING TO

CC SECONDARY LYMPHOID ORGANS. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein (Potential). 

CC -!- TISSUE SPECIFICITY: PROMINENTLY EXPRESSED ON HIGH ENDOTHELIAL 
CC VENULES BUT IS ALSO PRESENT ON THE ENDOTHELIA OF OTHER VESSELS. 

CC LOCALIZED TO THE INTERCELLULAR BOUNDARIES OF HIGH ENDOTHELIAL 

CC CELLS. 

CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily. 

CC -!- SIMILARITY: Contains 1 immunoglobulin-like V-type domain. 

CC -!- SIMILARITY: Contains 1 immunoglobulin-like C2-type domain. 

CC -!- DATABASE: NAME=PROW; NOTE=PROW 2:1-3(2001); 

CC WWW="http://www.ncbi.nlm.nih.gov/prow/guide/1652492186_g.htm".

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF255910; AAF81223.1; -. 

DR EMBL; AY016009; AAG49022.1; -. 

DR EMBL; BC017779; AAH17779.1; -. 

DR Genew; HGNC:14686; JAM2.

DR MIM; 606870; -.

DR GO; GO:0005887; C:integral to plasma membrane; NAS.

DR GO; GO:0016337; P:cell-cell adhesion; NAS.

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00047; ig; 2. 

DR SMART; SM00408; IGc2; 1. 

DR PROSITE; PS50835; IG_LIKE; 2. 

KW Immunoglobulin domain; Glycoprotein; Transmembrane; Signal. 



FT SIGNAL 1 20 POTENTIAL.
FT CHAIN 21 298 JUNCTIONAL ADHESION MOLECULE 2.
FT DOMAIN 21 238 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 239 259 POTENTIAL.
FT DOMAIN 260 298 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 32 127 IG-LIKE V-TYPE.
FT DOMAIN 134 238 IG-LIKE C2-TYPE.
FT DISULFID 50 109 POTENTIAL.
FT DISULFID 155 214 POTENTIAL.
FT CARBOHYD 98 98 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 187 187 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 236 236 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 298 AA; 33207 MW; CA78E518E22DCAEE CRC64;



Query Match 8.9%; Score 155; DB 1; Length 298; 

Best Local Similarity 27.7%; Pred. No. 2.9e-05; 

Matches 56; Conservative 24; Mismatches 86; Indels 36; Gaps 



Qy 96 LERNQRYIFFLEPTEQPLVFKTAFAPLDTN--GKNL-KKEVGKILCTDCATRPKLKKMKS 152

II: : : : : : : II : I I I I : : : | | | | : : : :

Db 66 LGRSVS FVYYQQTLQGD--FKNRAEMI DFNI RI KNVTRS DAGKYRCEVSAP SEQGQNLEE 123

Qy 153 QTGQV GEKQSLKCEAAAGNPQPSYRWFKDGKELNR -S 188 

I : I I : I : M I I I I I I I I I I 

Db 124 DTVTLEVLVAPAVPSCEVPSSALSGTWELRCQDKEGNPAPEYTWFKDGIRLLENPRLGS 183 

Qy 189 RDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRG-RLYVNSVSTTLSSWS 247 

: I I I I I I I I I I I I I I I : I ||:|::: I 
Db 184 QSTNSSYTMNTKTGTLQFNTVSKLDTGEYSCEARNSVGYRRCPGKRMQVDDLNI S 238 

Qy 248 GHARKCNETAKSYCVNG-GVCY 268

I I I I I I I I

Db 239 GIIAAWWALVISVCGLGVCY 260



RESULT 10
LAMP_RAT



ID LAMP_RAT STANDARD; PRT; 338 AA. 

AC Q62813; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Limbic system-associated membrane protein precursor (LSAMP).

GN LSAMP OR LAMP.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 29-49.

RC TISSUE=Hippocampus;

RX MEDLINE=95374785; PubMed=7646886;

RA Pimenta A.F., Zhukareva V., Barbe M.F., Reinoso B.S., Grimley C.,

RA Henzel W., Fischer I., Levitt P.;

RT "The limbic system-associated membrane protein is an Ig superfamily

RT member that mediates selective neuronal growth and axon targeting."; 

RL Neuron 15:287-297(1995). 

CC -!- FUNCTION: MEDIATES SELECTIVE NEURONAL GROWTH AND AXON TARGETING. 

CC CONTRIBUTES TO THE GUIDANCE OF DEVELOPING AXONS AND REMODELING OF 

CC MATURE CIRCUITS IN THE LIMBIC SYSTEM. ESSENTIAL FOR NORMAL GROWTH 

CC OF THE HIPPOCAMPAL MOSSY FIBER PROJECTION.

CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor. 

CC -!- TISSUE SPECIFICITY: Expressed mostly by neurons comprising limbic-

CC associated cortical and subcortical regions that function in

CC cognition, emotion, memory, and learning. 

CC -!- DEVELOPMENTAL STAGE: FIRST DETECTED AT E15-16, AT STAGE E20 IT IS 

CC DETECTED IN PRESUMPTIVE CORTEX, MEDIAL LIMBIC AREAS OF THE 

CC THALAMUS AND HYPOTHALAMUS. IN THE ADULT, IT IS FOUND IN 

CC HYPOTHALAMUS, PERIRHINAL CORTEX, AMYGDALA AND MEDIAL THALAMIC 

CC REGION. 

CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily. IgLON 

CC family. 

CC -!- SIMILARITY: Contains 3 immunoglobulin-like C2-type domains. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 
CC 
DR EMBL; U31554; AAA86120.1; 
DR InterPro; IPR007110; Ig-like. 
DR InterPro; IPR003598; Ig_c2. 
DR Pfam; PF00047; ig; 3. 
DR SMART; SM00408; IGc2; 2. 
DR PROSITE; PS50835; IG_LIKE; 3. 
KW Immunoglobulin domain; Cell adhesion; Glycoprotein; GPI-anchor; 
KW Repeat; Signal; Lipoprotein. 
FT SIGNAL 1 28 
FT CHAIN 29 315 LIMBIC SYSTEM-ASSOCIATED MEMBRANE 
FT PROTEIN. 
FT PROPEP 316 338 REMOVED IN MATURE FORM (POTENTIAL). 
FT DOMAIN 29 122 IG-LIKE C2-TYPE 1. 
FT DOMAIN 132 214 IG-LIKE C2-TYPE 2. 
FT DOMAIN 219 304 IG-LIKE C2-TYPE 3. 
FT DISULFID 53 111 POTENTIAL. 
FT DISULFID 153 197 POTENTIAL. 
FT DISULFID 239 290 POTENTIAL. 
FT CARBOHYD 40 40 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 66 66 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 136 136 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 148 148 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 279 279 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 287 287 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 300 300 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 315 315 N-LINKED (GLCNAC...) (POTENTIAL). 
FT LIPID 315 315 GPI-anchor amidated asparagine 
FT (Potential). 
SQ SEQUENCE 338 AA; 37324 MW; 0B76AFDD68A39BB6 CRC64; 



Query Match 8.3%; Score 145; DB 1; Length 338; 

Best Local Similarity 23.9%; Pred. No. 0.00021; 

Matches 55; Conservative 31; Mismatches 92; Indels 52; Gaps 9; 



Qy 53 SSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQP 112 

I : I I I : I : I I : : : : : I I I I : : I 

Db 112 SVQTQHEPKTSQVYLIVQV PPKISNISSDVTVNEGSNVTL VCMANGRPEP 161 

Qy 113 LVFKTAFAPL DTNGKNLKKEVGKI LCTDCAT RPKL 147 

: : II : : | | | : : | | : 

Db 162 VITWRHLTPLGREFEGEEEYLEILGITREQSGKYECKAANEVSSADVKQVKVTVNYPPTI 221 

Qy 148 KKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFN 207 

: II I : I I I I I I : I I I : I : : I : I : : I I I : II 

Db 222 TESKSNEATTGRQASLKCEASA-VPAPDFEWYRDDTRINSANGLEIKSTEGQ--SSLTVT 278 

Qy 208 KVKVE DAG E YVC EAEN I L G KDTVRGRLYVN-SVSTTLSSW 246 

I I I I I I I I I : I I I : | | : I : I 

Db 279 NVTEEHYGNYTCVAANKLGVTNASLVLFRPGSVRG INGSISLAVPLW 325 



RESULT 11 
LAMP_HUMAN 



ID LAMP_HUMAN STANDARD; PRT; 338 AA. 

AC Q13449; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Limbic system-associated membrane protein precursor (LSAMP). 

GN LSAMP OR LAMP. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=96235133; PubMed=8666243; 

RA Pimenta A.F., Fischer I., Levitt P.; 

RT "cDNA cloning and structural analysis of the human limbic-system- 

RT associated membrane protein (LAMP)."; 

RL Gene 170:189-195(1996). 

CC -!- FUNCTION: MEDIATES SELECTIVE NEURONAL GROWTH AND AXON TARGETING. 
CC CONTRIBUTES TO THE GUIDANCE OF DEVELOPING AXONS AND REMODELING OF 

CC MATURE CIRCUITS IN THE LIMBIC SYSTEM. ESSENTIAL FOR NORMAL GROWTH 

CC OF THE HIPPOCAMPAL MOSSY FIBER PROJECTION (BY SIMILARITY). 

CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor. 

CC -!- TISSUE SPECIFICITY: Expressed on limbic neurons and fiber tracts 
CC as well as in single layers of the superior colliculus, spinal 

CC cord and cerebellum. 

CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily. IgLON 
CC family. 

CC -!- SIMILARITY: Contains 3 immunoglobulin-like C2-type domains. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; U41901; AAC50569.1; 

DR PIR; JC4776; JC4776. 

DR Genew; HGNC:6705; LSAMP. 

DR MIM; 603241; -. 

DR GO; GO:0007399; P:neurogenesis; TAS. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00047; ig; 3. 

DR SMART; SM00408; IGc2; 2. 

DR PROSITE; PS50835; IG_LIKE; 3. 

KW Immunoglobulin domain; Cell adhesion; Glycoprotein; GPI-anchor; 

KW Repeat; Signal; Lipoprotein. 

FT SIGNAL 1 28 POTENTIAL. 

FT CHAIN 29 315 LIMBIC SYSTEM-ASSOCIATED MEMBRANE 

FT PROTEIN. 

FT PROPEP 316 338 REMOVED IN MATURE FORM (POTENTIAL). 
FT DOMAIN 29 122 IG-LIKE C2-TYPE 1. 
FT DOMAIN 132 214 IG-LIKE C2-TYPE 2. 
FT DOMAIN 219 304 IG-LIKE C2-TYPE 3. 
FT DISULFID 53 111 POTENTIAL. 
FT DISULFID 153 197 POTENTIAL. 
FT DISULFID 239 290 POTENTIAL. 
FT CARBOHYD 40 40 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 66 66 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 136 136 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 148 148 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 279 279 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 287 287 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 300 300 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 315 315 N-LINKED (GLCNAC...) (POTENTIAL). 
FT LIPID 315 315 GPI-anchor amidated asparagine 
FT (Potential). 
SQ SEQUENCE 338 AA; 37308 MW; 03455F286DF5D92F CRC64; 



Query Match 8.2%; Score 144; DB 1; Length 338; 

Best Local Similarity 25.0%; Pred. No. 0.00026; 

Matches 56; Conservative 34; Mismatches 94; Indels 40; Gaps 10; 

Qy 53 SSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGS CVPLERNQRYIFF-- 105 

I : I I I : I : I I : : : : : I I I : I : I : 

Db 112 SVQTQHEPKTSQVYLIVQV PPKISNISSDVTVNEGSNVTLVCMANGRPEPVITWRH 167 

Qy 106 LEPTEQPLVFKTAFAPL DTNGKNLKKEVGKILCTDCAT RPKLKKMKSQ 153 

III: : : : : : | | | : : | I : : I I 

Db 168 LTPTGREFEGEEEYLEILGITREQSGKYECKAANEVSSADVKQVKVTVNYPPTITESKSN 227 

Qy 154 TGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVED 213 

I : I I I I I I : I I I : I : : I : | : : I I I : II I | 

Db 228 EATTGRQASLKCEASA-VPAPDFEWYRDDTRINSANGLEIKSTEGQ--SSLTVTNVTEEH 284 

Qy 214 AGEYVCEAENILG KDTVRGRLYVN- SVSTTLSSW 246 

I I I I I I I : I I I : I | : | : | 
Db 285 YGNYTCVAANKLGVTNAS LVLFRPGSVRG INGSISLAVPLW 325 



RESULT 12 
CEPU_CHICK 

ID CEPU_CHICK STANDARD; PRT; 353 AA. 

AC Q90773; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE CEPU-1 protein precursor. 

OS Gallus gallus (Chicken). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae; 

OC Gallus. 

OX NCBI_TaxID=9031; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2). 

RC TISSUE=Brain; 

RX MEDLINE=96370549; PubMed=8774445; 

RA Spaltmann F., Bruemmendorf T.; 

RT "CEPU-1, a novel immunoglobulin superfamily molecule, is expressed by 
RT developing cerebellar Purkinje cells."; 
RL J. Neurosci. 16:1770-1779(1996). 

CC -!- FUNCTION: It may be a cellular address molecule specific to 
CC Purkinje cells. It may represent a receptor or a subunit of a 
CC receptor complex. 
CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor. 
CC -!- ALTERNATIVE PRODUCTS: 
CC Event=Alternative splicing; Named isoforms=2; 
CC Name=1; Synonyms=Minor; 
CC IsoId=Q90773-1; Sequence=Displayed; 
CC Name=2; Synonyms=Major; 
CC IsoId=Q90773-2; Sequence=VSP_002607; 
CC -!- TISSUE SPECIFICITY: Found on the dendrites, somata and axons of 
CC developing Purkinje cells. Undetectable on other neurons like 
CC Golgi or granule cells. 
CC -!- DEVELOPMENTAL STAGE: Expressed by developing cerebellar Purkinje 
CC cells. Expression coincides with the growth of the dendritic tree, 
CC after Purkinje cells have finished their migration from the 
CC ventricular zone (from E15 until E21). Expressed in the adult. 
CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily. IgLON 
CC family. 
CC -!- SIMILARITY: Contains 3 immunoglobulin-like C2-type domains. 
CC 
CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 
CC 
DR EMBL; Z72497; CAA96578.1; -. 
DR InterPro; IPR007110; Ig-like. 
DR InterPro; IPR003598; Ig_c2. 
DR Pfam; PF00047; ig; 3. 
DR SMART; SM00408; IGc2; 2. 
DR PROSITE; PS50835; IG_LIKE; 3. 
KW Immunoglobulin domain; Cell adhesion; Glycoprotein; GPI-anchor; 
KW Repeat; Signal; Alternative splicing; Lipoprotein. 
FT /FTId=VSP_002607. 

SQ SEQUENCE 353 AA; 38736 MW; 2550C48591EBBBA6 CRC64; 

Query Match 8.1%; Score 141.5; DB 1; Length 353; 

Best Local Similarity 36.5%; Pred. No. 0.00043; 

Matches 38; Conservative 11; Mismatches 52; Indels 3; Gaps 3; 

Qy 145 PKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRL 204 

I : II I I : I I I I I : I I : : I : I I I I : : | | | ! I 

Db 221 PYISDAKSTGVPVGQKGILMCEASA-VPSADFQWYKDDKRLAEGQK-GLKVENKAFFSRL 278 

Qy 205 QFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTTLSSWSG 248 

I I : I I I I I I I I II : | | : | | 

Db 279 TFFNVSEQDYGNYTCVASNQLGNTNASMILY-EETTTALTPWKG 321 



RESULT 13 
NCA2_HUMAN 

ID NCA2_HUMAN STANDARD; PRT; 761 AA. 

AC P13592; P13593; 

DT 01-JAN-1990 (Rel. 13, Created) 

DT 01-APR-1990 (Rel. 14, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Neural cell adhesion molecule 1, 120 kDa isoform precursor (N-CAM 120) 

DE (NCAM-120) (CD56 antigen). 

GN NCAM1 OR NCAM. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM N-CAM 120). 

RC TISSUE=Skeletal muscle; 

RX MEDLINE=89305258; PubMed=3253057; 

RA Barton C.H., Dickson G., Gower H.J., Rowett L.H., Putt W., 

RA Elsom V., Moore S.E., Goridis C., Walsh F.S.; 

RT "Complete sequence and in vitro expression of a tissue-specific 

RT phosphatidylinositol-linked N-CAM isoform from skeletal muscle."; 

RL Development 104:165-173(1988). 

RN [2] 

RP SEQUENCE OF 491-761 FROM N.A. (ISOFORM N-CAM 120). 

RC TISSUE=Skeletal muscle; 

RX MEDLINE=87301755; PubMed=2887295; 

RA Dickson G., Gower H.J., Barton C.H., Prentice H.M., Elsom V.L., 

RA Moore S.E., Cox R.D., Quinn C., Putt W., Walsh F.S.; 

RT "Human muscle neural cell adhesion molecule (N-CAM): identification 

RT of a muscle-specific sequence in the extracellular domain."; 

RL Cell 50:1119-1130(1987). 

RN [3] 

RP SEQUENCE OF 491-655 FROM N.A. (ISOFORM C). 

RX MEDLINE=89077552; PubMed=3203385; 

RA Gower H.J., Barton C.H., Elsom V.L., Thompson J., Moore S.E., 

RA Dickson G., Walsh F.S.; 

RT "Alternative splicing generates a secreted form of N-CAM in muscle 

RT and brain."; 

RL Cell 55:955-964(1988). 

CC -!- FUNCTION: This protein is a cell adhesion molecule involved in 
CC neuron-neuron adhesion, neurite fasciculation, outgrowth of 
CC neurites, etc. 
CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor. 
CC -!- ALTERNATIVE PRODUCTS: 
CC Event=Alternative splicing; Named isoforms=3; 
CC Name=N-CAM 120; 
CC IsoId=P13592-2; Sequence=Displayed; 
CC Name=N-CAM 140; 
CC IsoId=P13591-1; Sequence=External; 
CC Name=C; Synonyms=Secreted; 
CC IsoId=P13592-1; Sequence=VSP_002587; 
CC -!- SIMILARITY: Contains 5 immunoglobulin-like C2-type domains. 
CC -!- SIMILARITY: Contains 2 fibronectin type III domains. 
CC -!- DATABASE: NAME=PROW; NOTE=CD guide CD56 entry; 
CC WWW="http://www.ncbi.nlm.nih.gov/prow/cd/cd56.htm". 
CC 
CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 
CC 
DR EMBL; X16841; CAA34739.1; -. 
DR EMBL; M17409; AAA59912.1; -. 
DR EMBL; M22094; AAA59910.1; -. 
DR EMBL; M22092; AAA59911.1; 
DR EMBL; M22091; AAA59911.1; JOINED. 
DR PIR; A31635; A31635. 
DR PIR; S07784; IJHUNG. 
DR Genew; HGNC:7656; NCAM1. 
DR MIM; 116930; -. 
DR GO; GO:0016021; C:integral to membrane; TAS. 
DR GO; GO:0005886; C:plasma membrane; TAS. 
DR InterPro; IPR008957; FN_III-like. 
DR InterPro; IPR003961; FN_III. 
DR InterPro; IPR007110; Ig-like. 
DR InterPro; IPR003598; Ig_c2. 
DR Pfam; PF00041; fn3; 2. 
DR Pfam; PF00047; ig; 5. 
DR SMART; SM00060; FN3; 2. 
DR SMART; SM00408; IGc2; 5. 
DR PROSITE; PS50835; IG_LIKE; 5. 
KW Immunoglobulin domain; Cell adhesion; Glycoprotein; Repeat; Signal; 
KW GPI-anchor; Alternative splicing. 
SQ SEQUENCE 761 AA; 83770 MW; F0CAD3292D7AB67E CRC64; 



Query Match 8.0%; Score 140; DB 1; Length 761; 

Best Local Similarity 25.4%; Pred. No. 0.0014; 

Matches 45; Conservative 26; Mismatches 60; Indels 46; Gaps 7; 

Qy 154 TGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVED 213 

I :|: :| I: I I |:|: I III::: : I II :|:| II | 

Db 224 TANLGQSVTLVCD-AEGFPEPTMSWTKDGEQIEQEEDDE-KYIFSDDSSQLTIKKVDKND 281 

Qy 214 AGEYVCEAENILGKD--TVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIE 271 

I I : I I I I I : I : : : : | : | : | | : | 

Db 282 EAEYICIAENKAGEQDATIHLKVFAKPKITYVE NQTAME LE 322 

Qy 272 GINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLD 328 

I : I : It 1=11 : I I : I I I 

Db 323 EQVTLTCEASG DPI PSITWRTSTRNI SSEE KTLD 356 



RESULT 14 
NCA1_HUMAN 

ID NCA1_HUMAN STANDARD; PRT; 848 AA. 

AC P13591; Q15829; Q16180; 

DT 01-JAN-1990 (Rel. 13, Created) 

DT 15-JUL-1998 (Rel. 36, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Neural cell adhesion molecule 1, 140 kDa isoform precursor (N-CAM 140) 

DE (NCAM-140) (CD56 antigen). 

GN NCAM1 OR NCAM. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=94356433; PubMed=8075973; 

RA Saito S., Tanio Y., Tachibana I., Hayashi S., Kishimoto T., Kawase I.; 

RT "Complementary DNA sequence encoding the major neural cell adhesion 

RT molecule isoform in a human small cell lung cancer cell line."; 

RL Lung Cancer 10:307-318(1994). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=91250739; PubMed=1710251; 

RA Lanier L.L., Chang C., Azuma M., Ruitenberg J.J., Hemperly J.J., 

RA Phillips J.H.; 

RT "Molecular and functional analysis of human natural killer cell- 

RT associated neural cell adhesion molecule (N-CAM/CD56)."; 

RL J. Immunol. 146:4421-4426(1991). 

RN [3] 

RP SEQUENCE OF 491-848 FROM N.A. 

RX MEDLINE=87301755; PubMed=2887295; 

RA Dickson G., Gower H.J., Barton C.H., Prentice H.M., Elsom V.L., 

RA Moore S.E., Cox R.D., Quinn C., Putt W., Walsh F.S.; 

RT "Human muscle neural cell adhesion molecule (N-CAM): identification 

RT of a muscle-specific sequence in the extracellular domain."; 

RL Cell 50:1119-1130(1987). 

CC -!- FUNCTION: This protein is a cell adhesion molecule involved in 
CC neuron-neuron adhesion, neurite fasciculation, outgrowth of 

CC neurites, etc. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=3; 

CC Name=N-CAM 140; 

CC IsoId=P13591-1; Sequence=Displayed; 

CC Name=N-CAM 120; 

CC IsoId=P13592-2; Sequence=External; 

CC Name=C; Synonyms=Secreted; 

CC IsoId=P13592-1; Sequence=External; 

CC -!- SIMILARITY: Contains 5 immunoglobulin-like C2-type domains. 

CC -!- SIMILARITY: Contains 2 fibronectin type III domains. 

CC -!- DATABASE: NAME=PROW; NOTE=CD guide CD56 entry; 

CC WWW="http://www.ncbi.nlm.nih.gov/prow/cd/cd56.htm". 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; S71824; AAB31836.1; -. 

DR EMBL; U63041; AAB04558.1; -. 

DR EMBL; M17410; AAA59913.1; -. 

DR HSSP; P40189; 1BQU. 

DR Genew; HGNC:7656; NCAM1. 

DR MIM; 116930; -. 

DR GO; GO:0016021; C:integral to membrane; TAS. 

DR GO; GO:0005886; C:plasma membrane; TAS. 

DR InterPro; IPR008957; FN_III-like. 

DR InterPro; IPR003961; FN_III. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR Pfam; PF00041; fn3; 2. 

DR Pfam; PF00047; ig; 5. 

DR SMART; SM00060; FN3; 2. 

DR SMART; SM00408; IGc2; 5. 

DR PROSITE; PS50835; IG_LIKE; 5. 

KW Immunoglobulin domain; Cell adhesion; Glycoprotein; Repeat; Signal; 

KW Transmembrane; Alternative splicing. 
SQ SEQUENCE 848 AA; 93360 MW; 68D2F0C0E6C1C2AD CRC64; 



Query Match 8.0%; Score 140; DB 1; Length 848; 

Best Local Similarity 25.4%; Pred. No. 0.0016; 

Matches 45; Conservative 26; Mismatches 60; Indels 46; Gaps 7; 

Qy 154 TGQVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVED 213 

I : I : : I I : I I I : I : I I I I : : : : I I I : | : | | | | 

Db 224 TANLGQSVTLVCD-AEGFPEPTMSWTKDGEQIEQEEDDE-KYIFSDDSSQLTIKKVDKND 281 

Qy 214 AGEYVCEAENILGKD--TVRGRLYVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIE 271 

II : I I M I : I : ::: I : | : | | : | 

Db 282 EAEYICIAENKAGEQDATIHLKVFAKPKITYVE NQTAME LE 322 

Qy 272 GINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQSVLWDTPGTGVSSSQWSTSPSTLD 328 

1:1: 111:11 : I I : II I 

Db 323 EQVTLTCEASG DPIPSITWRTSTRNISSEE KTLD 356 



RESULT 15 
UN89_CAEEL 

ID UN89_CAEEL STANDARD; PRT; 6632 AA. 

AC O01761; Q17362; 

DT 10-OCT-2003 (Rel. 42, Created) 

DT 10-OCT-2003 (Rel. 42, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Muscle M-line assembly protein unc-89 (Uncoordinated protein 89). 
GN UNC-89 OR C09D1.1. 



OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239; 
RN [1] 

RP SEQUENCE FROM N.A., FUNCTION, AND TISSUE SPECIFICITY. 

RC STRAIN=Bristol N2; 

RX MEDLINE=96180278; PubMed=8603916; 

RA Benian G.M., Tinley T.L., Tang X., Borodovsky M.; 

RT "The Caenorhabditis elegans gene unc-89, required for muscle M-line 

RT assembly, encodes a giant modular protein composed of Ig and signal 

RT transduction domains."; 

RL J. Cell Biol. 132:835-848(1996). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Bristol N2; 

RA Du Z., Le T.T., Wilson R.; 

RL Submitted (MAY-1997) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP REVISIONS. 

RA Waterston R.; 

RL Submitted (APR-2002) to the EMBL/GenBank/DDBJ databases. 

CC -!- FUNCTION: Structural component of the muscle M-line. Myofilament 
CC lattice assembly begins with positional cues laid down in the 

CC basement membrane and muscle cell membrane. UNC-89 responds to 

CC these signals, localizes, and then participates in assembling an 

CC M-line. 

CC -!- TISSUE SPECIFICITY: Localizes to the middle of A-bands. 

CC -!- SIMILARITY: Contains 1 DBL-homology (DH) domain. 

CC -!- SIMILARITY: Contains 1 fibronectin type III domain. 

CC -!- SIMILARITY: Contains 49 immunoglobulin-like C2-type domains. 

CC -!- SIMILARITY: Contains 1 PH domain. 

CC -!- SIMILARITY: Contains 5 RCSD domains. 

CC -!- SIMILARITY: Contains 1 SH3 domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 
DR EMBL; U33058; AAB00542.1; -. 

DR EMBL; AF003131; AAB54132.2; -. 

DR PDB; 1FH0; 20-DEC-00. 

DR WormPep; C09D1.1; CE30426. 

DR InterPro; IPR008957; FN_III-like. 

DR InterPro; IPR003961; FN_III. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR003006; Ig_MHC. 

DR InterPro; IPR001849; PH. 

DR InterPro; IPR007850; RCSD. 

DR InterPro; IPR000219; RhoGEF. 

DR InterPro; IPR001452; SH3. 

DR Pfam; PF00041; fn3; 1. 
Query Match 8.0%; Score 139.5; DB 1; Length 6632; 

Best Local Similarity 29.0%; Pred. No. 0.021; 

Matches 54; Conservative 18; Mismatches 65; Indels 49; Gaps 8; 